Nucleic acid ligands

ABSTRACT

This invention comprises nucleic acid ligands for use as diagnostic reagents for detecting the presence or absence of a target molecule in a sample, and as diagnostic reagents for measuring the amount of a target molecule in a sample. In a preferred embodiment the nucleic acid ligands are identified by the method of the invention referred to as the Systematic Evolution of Ligands by EXponential enrichment (SELEX), wherein a candidate mixture of nucleic acids is iteratively enriched in high affinity nucleic acids and amplified for further partitioning.

[0001] This application is a continuation application of co-pending U.S. patent application Ser. No. 09/502,344, filed Feb. 10, 2000, which is a continuation application of U.S. patent application Ser. No. 09/143,190, filed on Aug. 27, 1998, now U.S. Pat. No. 5,843,653, which is a continuation application of co-pending U.S. patent application Ser. No. 08/469,609, filed on Jun. 6, 1995, which is a continuation application of U.S. patent application Ser. No. 08/428,964, filed Apr. 25, 1995, now abandoned. U.S. patent application Ser. No. 08/469,609 is also a continuation of U.S. patent application Ser. No. 08/409,442, filed Mar. 24, 1995, now U.S. Pat. No. 5,696,249. U.S. patent application Ser. No. 08/469,609 is also a continuation of Ser. No. 08/412,110, filed Mar. 27, 1995, now U.S. Pat. No. 5,670,637. Ser. Nos. 08/428,964, 08/409,442, and 08/412,110 are continuations of U.S. patent application Ser. No. 07/714,131, filed Jun. 10, 1991, now U.S. Pat. No. 5,475,096, which is a Continuation-in-Part application of U.S. patent application Ser. No. 07/536,428, filed Jun. 11, 1990, now abandoned.

[0002] This work was supported by grants from the United States Government funded through the National Institutes of Health. The U.S. Government has certain rights in this invention.

FIELD OF THE INVENTION

[0003] We describe herein a new class of high-affinity nucleic acid ligands that specifically bind a desired target molecule. A method is presented for selecting a nucleic acid ligand that specifically binds any desired target molecule. The method is termed SELEX, an acronym for Systematic Evolution of Ligands by Exponential enrichment. The method of the invention (SELEX) is useful to isolate a nucleic acid ligand for a desired target molecule. The nucleic acid products of the invention are useful for any purpose to which a binding reaction may be put, for example in assay methods, diagnostic procedures, cell sorting, as inhibitors of target molecule function, as probes, as sequestering agents and the like. In addition, nucleic acid products of the invention can have catalytic activity. Target molecules include natural and synthetic polymers, including proteins, polysaccharides, glycoproteins, hormones, receptors and cell surfaces, and small molecules such as drugs, metabolites, cofactors, transition state analogs and toxins.

BACKGROUND OF THE INVENTION

[0004] Most proteins or small molecules are not known to specifically bind to nucleic acids. The known protein exceptions are those regulatory proteins such as repressors, polymerases, activators and the like which function in a living cell to bring about the transfer of genetic information encoded in the nucleic acids into cellular structures and the replication of the genetic material. Furthermore, small molecules such as GTP bind to some intron RNAs.

[0005] Living matter has evolved to limit the function of nucleic acids to a largely informational role. The Central Dogma, as postulated by Crick, both originally and in expanded form, proposes that nucleic acids (either RNA or DNA) can serve as templates for the synthesis of other nucleic acids through replicative processes that “read” the information in a template nucleic acid and thus yield complementary nucleic acids. All of the experimental paradigms for genetics and gene expression depend on these properties of nucleic acids: in essence, double-stranded nucleic acids are informationally redundant because of the chemical concept of base pairs and because replicative processes are able to use that base pairing in a relatively error-free manner.

[0006] The individual components of proteins, the twenty natural amino acids, possess sufficient chemical differences and activities to provide an enormous breadth of activities for both binding and catalysis. Nucleic acids, however, are thought to have narrower chemical possibilities than proteins, but to have an informational role that allows genetic information to be passed from virus to virus, cell to cell, and organism to organism. In this context nucleic acid components, the nucleotides, must possess only pairs of surfaces that allow informational redundancy within a Watson-Crick base pair. Nucleic acid components need not possess chemical differences and activities sufficient for either a wide range of binding or catalysis.

[0007] However, some nucleic acids found in nature do participate in binding to certain target molecules and even a few instances of catalysis have been reported. The range of activities of this kind is narrow compared to proteins and more specifically antibodies. For example, where nucleic acids are known to bind to some protein targets with high affinity and specificity, the binding depends on the exact sequences of nucleotides that comprise the DNA or RNA ligand. Thus, short double-stranded DNA sequences are known to bind to target proteins that repress or activate transcription in both prokaryotes and eukaryotes. Other short double-stranded DNA sequences are known to bind to restriction endonucleases, protein targets that can be selected with high affinity and specificity. Other short DNA sequences serve as centromeres and telomeres on chromosomes, presumably by creating ligands for the binding of specific proteins that participate in chromosome mechanics. Thus, double-stranded DNA has a well-known capacity to bind within the nooks and crannies of target proteins whose functions are directed to DNA binding. Single-stranded DNA can also bind to some proteins with high affinity and specificity, although the number of examples is rather smaller. From the known examples of double-stranded DNA binding proteins, it has become possible to describe the binding interactions as involving various protein motifs projecting amino acid side chains into the major groove of B form double-stranded DNA, providing the sequence inspection that allows specificity.

[0008] Double-stranded RNA occasionally serves as a ligand for certain proteins, for example, the endonuclease RNase III from E. coli. There are more known instances of target proteins that bind to single-stranded RNA ligands, although in these cases the single-stranded RNA often forms a complex three-dimensional shape that includes local regions of intramolecular double-strandedness. The amino-acyl tRNA synthetases bind tightly to tRNA molecules with high specificity. A short region within the genomes of RNA viruses binds tightly and with high specificity to the viral coat proteins. A short sequence of RNA binds to the bacteriophage T4-encoded DNA polymerase, again with high affinity and specificity. Thus, it is possible to find RNA and DNA ligands, either double- or single-stranded, serving as binding partners for specific protein targets. Most known DNA binding proteins bind specifically to double-stranded DNA, while most RNA binding proteins recognize single-stranded RNA. This statistical bias in the literature no doubt reflects the present biosphere's statistical predisposition to use DNA as a double-stranded genome and RNA as a single-stranded entity in the many roles RNA plays beyond serving as a genome. Chemically there is no strong reason to dismiss single-stranded DNA as a fully able partner for specific protein interactions.

[0009] RNA and DNA have also been found to bind to smaller target molecules. Double-stranded DNA binds to various antibiotics, such as actinomycin D. A specific single-stranded RNA binds to the antibiotic thiostrepton; specific RNA sequences and structures probably bind to certain other antibiotics, especially those whose function is to inactivate ribosomes in a target organism. A family of evolutionarily related RNAs binds with specificity and decent affinity to nucleotides and nucleosides (Bass, B. and Cech, T. (1984) Nature 308:820-826) as well as to one of the twenty amino acids (Yarus, M. (1988) Science 240:1751-1758). Catalytic RNAs are now known as well, although these molecules perform over a narrow range of chemical possibilities, which are thus far related largely to phosphodiester transfer reactions and hydrolysis of nucleic acids.

[0010] Despite these known instances, the great majority of proteins and other cellular components are thought not to bind to nucleic acids under physiological conditions and such binding as may be observed is non-specific. Either the capacity of nucleic acids to bind other compounds is limited to the relatively few instances enumerated supra, or the chemical repertoire of the nucleic acids for specific binding is avoided (selected against) in the structures that occur naturally. The present invention is premised on the inventors' fundamental insight that nucleic acids as chemical compounds can form a virtually limitless array of shapes, sizes and configurations, and are capable of a far broader repertoire of binding and catalytic functions than those displayed in biological systems.

[0011] The chemical interactions have been explored in cases of certain known instances of protein-nucleic acid binding. For example, the size and sequence of the RNA site of bacteriophage R17 coat protein binding has been identified by Uhlenbeck and coworkers. The minimal natural RNA binding site (21 bases long) for the R17 coat protein was determined by subjecting variable-sized labeled fragments of the mRNA to nitrocellulose filter binding assays in which protein-RNA fragment complexes remain bound to the filter (Carey et al. (1983) Biochemistry 22:2601). A number of sequence variants of the minimal R17 coat protein binding site were created in vitro in order to determine the contributions of individual nucleic acids to protein binding (Uhlenbeck et al. (1983) J. Biomol. Structure Dynamics 1:539 and Romaniuk et al. (1987) Biochemistry 26:1563). It was found that the maintenance of the hairpin loop structure of the binding site was essential for protein binding but, in addition, that nucleotide substitutions at most of the single-stranded residues in the binding site, including a bulged nucleotide in the hairpin stem, significantly affected binding. In similar studies, the binding of bacteriophage Qβ coat protein to its translational operator was examined (Witherell and Uhlenbeck (1989) Biochemistry 28:71). The Qβ coat protein RNA binding site was found to be similar to that of R17 in size, and in predicted secondary structure, in that it comprised about 20 bases with an 8 base pair hairpin structure which included a bulged nucleotide and a 3 base loop. In contrast to the R17 coat protein binding site, only one of the single-stranded residues of the loop is essential for binding and the presence of the bulged nucleotide is not required. The protein-RNA binding interactions involved in translational regulation display significant specificity.

[0012] Nucleic acids are known to form secondary and tertiary structures in solution. The double-stranded forms of DNA include the so-called B double-helical form, Z-DNA and superhelical twists (Rich, A. et al. (1984) Ann. Rev. Biochem. 53:791-846). Single-stranded RNA forms localized regions of secondary structure such as hairpin loops and pseudoknot structures (Schimmel, P. (1989) Cell 58:9-12). However, little is known concerning the effects of unpaired loop nucleotides on stability of loop structure, kinetics of formation and denaturation, thermodynamics, and almost nothing is known of tertiary structures and three dimensional shape, nor of the kinetics and thermodynamics of tertiary folding in nucleic acids (Tuerk, C. et al. (1988) Proc. Natl. Acad. Sci. USA 85:1364-1368).

[0013] A type of in vitro evolution was reported in replication of the RNA bacteriophage Qβ. Mills, D. R. et al. (1967) Proc. Natl. Acad. Sci. USA 58:217-224; Levisohn, R. and Spiegelman, S. (1968) Proc. Natl. Acad. Sci. USA 60:866-872; Levisohn, R. and Spiegelman, S. (1969) Proc. Natl. Acad. Sci. USA 63:805-811; Saffhill, R. et al. (1970) J. Mol. Biol. 51:531-539; Kacian, D. L. et al. (1972) Proc. Natl. Acad. Sci. USA 69:3038-3042; Mills, D. R. et al. (1973) Science 180:916-927. The phage RNA serves as a poly-cistronic messenger RNA directing translation of phage-specific proteins and also as a template for its own replication catalyzed by Qβ RNA replicase. This RNA replicase was shown to be highly specific for its own RNA templates. During the course of cycles of replication in vitro small variant RNAs were isolated which were also replicated by Qβ replicase. Minor alterations in the conditions under which cycles of replication were performed were found to result in the accumulation of different RNAs, presumably because their replication was favored under the altered conditions. In these experiments, the selected RNA had to be bound efficiently by the replicase to initiate replication and had to serve as a kinetically favored template during elongation of RNA. Kramer et al. (1974) J. Mol. Biol. 89:719 reported the isolation of a mutant RNA template of Qβ replicase, the replication of which was more resistant to inhibition by ethidium bromide than the natural template. It was suggested that this mutant was not present in the initial RNA population but was generated by sequential mutation during cycles of in vitro replication with Qβ replicase. The only source of variation during selection was the intrinsic error rate during elongation by Qβ replicase. In these studies what was termed “selection” occurred by preferential amplification of one or more of a limited number of spontaneous variants of an initially homogeneous RNA sequence. There was no selection of a desired result, only that which was intrinsic to the mode of action of Qβ replicase.

[0014] Joyce and Robertson (Joyce (1989) in RNA: Catalysis, Splicing, Evolution, Belfort and Shub (eds.), Elsevier, Amsterdam pp. 83-87; and Robertson and Joyce (1990) Nature 344:467) reported a method for identifying RNAs which specifically cleave single-stranded DNA. The selection for catalytic activity was based on the ability of the ribozyme to catalyze the cleavage of a substrate ssRNA or DNA at a specific position and transfer the 3′-end of the substrate to the 3′-end of the ribozyme. The product of the desired reaction was selected by using an oligodeoxynucleotide primer which could bind only to the completed product across the junction formed by the catalytic reaction and allowed selective reverse transcription of the ribozyme sequence. The selected catalytic sequences were amplified by attachment of the promoter of T7 RNA polymerase to the 3′-end of the cDNA, followed by transcription to RNA. The method was employed to identify from a small number of ribozyme variants the variant that was most reactive for cleavage of a selected substrate. Only a limited array of variants was testable, since variation depended upon single nucleotide changes occurring during amplification.

[0015] The prior art has not taught or suggested more than a limited range of chemical functions for nucleic acids in their interactions with other substances: as targets for protein ligands evolved to bind certain specific oligonucleotide sequences; more recently, as catalysts with a limited range of activities. Prior “selection” experiments have been limited to a narrow range of variants of a previously described function. Now, for the first time, it will be understood that the nucleic acids are capable of a vastly broad range of functions and the methodology for realizing that capability is disclosed herein.

SUMMARY OF THE INVENTION

[0016] The present invention provides a class of products which are nucleic acid molecules, each having a unique sequence, each of which has the property of binding specifically to a desired target compound or molecule. Each compound of the invention is a specific ligand of a given target molecule. The invention is based on the unique insight that nucleic acids have sufficient capacity for forming a variety of two- and three-dimensional structures and sufficient chemical versatility available within their monomers to act as ligands (form specific binding pairs) with virtually any chemical compound, whether monomeric or polymeric. Molecules of any size can serve as targets. Most commonly, and preferably, for therapeutic applications, binding takes place in aqueous solution at conditions of salt, temperature and pH near acceptable physiological limits.

[0017] The invention also provides a method which is generally applicable to make a nucleic acid ligand for any desired target. The method involves selection from a mixture of candidates and step-wise iterations of structural improvement, using the same general selection theme, to achieve virtually any desired criterion of binding affinity and selectivity. Starting from a mixture of nucleic acids, preferably comprising a segment of randomized sequence, the method, termed SELEX herein, includes steps of contacting the mixture with the target under conditions favorable for binding, partitioning unbound nucleic acids from those nucleic acids which have bound to target molecules, dissociating the nucleic acid-target pairs, amplifying the nucleic acids dissociated from the nucleic acid-target pairs to yield a ligand-enriched mixture of nucleic acids, then reiterating the steps of binding, partitioning, dissociating and amplifying through as many cycles as desired.
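The cycle of contacting, partitioning, dissociating and amplifying can be illustrated with a small numerical sketch. The following is a toy model only, not the procedure of the invention: the scoring rule `toy_affinity`, the four-base "best" sequence `GACU`, and the assumption that the retained amount of each species is proportional to its abundance times its affinity are all hypothetical choices made for illustration.

```python
from itertools import product

BASES = "ACGU"

def toy_affinity(seq, best):
    # Hypothetical scoring rule (an assumption, not taken from the text):
    # every position that matches `best` multiplies relative affinity by 10.
    return 10.0 ** sum(a == b for a, b in zip(seq, best))

def selex_round(pool, best):
    # Contact + partition: the amount of each species retained with the
    # target is modelled as proportional to abundance times affinity.
    bound = {seq: frac * toy_affinity(seq, best) for seq, frac in pool.items()}
    total = sum(bound.values())
    # Dissociate + amplify: renormalise the retained species to a full pool.
    return {seq: amount / total for seq, amount in bound.items()}

def run_selex(best="GACU", rounds=4):
    # Start from every possible sequence of the given length in equal amounts.
    seqs = ["".join(p) for p in product(BASES, repeat=len(best))]
    pool = {seq: 1.0 / len(seqs) for seq in seqs}
    history = [pool[best]]
    for _ in range(rounds):
        pool = selex_round(pool, best)
        history.append(pool[best])
    return history

if __name__ == "__main__":
    for cycle, frac in enumerate(run_selex()):
        print(f"cycle {cycle}: best ligand is {frac:.4f} of the pool")
```

Under these assumed rules the fraction of the pool occupied by the highest-affinity species rises with every cycle, which is the qualitative behaviour the method relies on; real enrichment depends on the actual affinities, concentrations and partitioning efficiency.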

[0018] While not bound by a theory of preparation, SELEX is based on the inventors' insight that within a nucleic acid mixture containing a large number of possible sequences and structures there is a wide range of binding affinities for a given target. A nucleic acid mixture comprising, for example, a 20 nucleotide randomized segment can have 4²⁰ candidate possibilities. Those which have the higher affinity constants for the target are most likely to bind. After partitioning, dissociation and amplification, a second nucleic acid mixture is generated, enriched for the higher binding affinity candidates. Additional rounds of selection progressively favor the best ligands until the resulting nucleic acid mixture is predominantly composed of only one or a few sequences. These can then be cloned, sequenced and individually tested for binding affinity as pure ligands.
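The size of the candidate space follows directly from the length of the randomized segment: with four possible nucleotides at each of n positions there are 4ⁿ sequences. A minimal sketch of that arithmetic is given below; the helper names `sequence_space` and `min_random_length` are illustrative and do not appear in the specification.

```python
def sequence_space(n_random, alphabet_size=4):
    """Number of distinct sequences in a fully randomized segment."""
    return alphabet_size ** n_random

def min_random_length(n_species, alphabet_size=4):
    """Shortest randomized segment whose sequence space holds n_species."""
    n = 0
    while alphabet_size ** n < n_species:
        n += 1
    return n

if __name__ == "__main__":
    # A 20 nucleotide randomized segment, as in the example in the text.
    print(sequence_space(20))        # 1099511627776, i.e. about 1.1e12
    print(min_random_length(10**12)) # 20
```

Note that the number of molecules physically present in a mixture may be smaller or larger than the sequence space, so this count is an upper bound on the number of distinct candidates, not a statement about any particular preparation.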

[0019] Cycles of selection and amplification are repeated until a desired goal is achieved. In the most general case, selection/amplification is continued until no significant improvement in binding strength is achieved on repetition of the cycle. The iterative selection/amplification method is sensitive enough to allow isolation of a single sequence variant in a mixture containing at least 65,000 sequence variants. The method is even capable of isolating a small number of high affinity sequences in a mixture containing 10¹⁴ sequences. The method could, in principle, be used to sample as many as about 10¹⁸ different nucleic acid species. The nucleic acids of the test mixture preferably include a randomized sequence portion as well as conserved sequences necessary for efficient amplification. Nucleic acid sequence variants can be produced in a number of ways including synthesis of randomized nucleic acid sequences and size selection from randomly cleaved cellular nucleic acids. The variable sequence portion may contain a fully or partially random sequence; it may also contain subportions of conserved sequence incorporated with randomized sequence. Sequence variation in test nucleic acids can be introduced or increased by mutagenesis before or during the selection/amplification iterations.

[0020] In one embodiment of the present invention, the selection process is so efficient at isolating those nucleic acid ligands that bind most strongly to the selected target, that only one cycle of selection and amplification is required. Such an efficient selection may occur, for example, in a chromatographic-type process wherein the ability of nucleic acids to associate with targets bound on a column operates in such a manner that the column is sufficiently able to allow separation and isolation of the highest affinity nucleic acid ligands.

[0021] In many cases, it is not necessarily desirable to perform the iterative steps of SELEX until a single nucleic acid ligand is identified. The target-specific nucleic acid ligand solution may include a family of nucleic acid structures or motifs that have a number of conserved sequences and a number of sequences which can be substituted or added without significantly affecting the affinity of the nucleic acid ligands to the target. By terminating the SELEX process prior to completion, it is possible to determine the sequence of a number of members of the nucleic acid ligand solution family, which will allow the determination of a comprehensive description of the nucleic acid ligand solution.

[0022] After a description of the nucleic acid ligand family has been resolved by SELEX, in certain cases it may be desirable to perform a further series of SELEX that is tailored by the information received during the SELEX experiment. In one embodiment, the second series of SELEX will fix those conserved regions of the nucleic acid ligand family while randomizing all other positions in the ligand structure. In an alternate embodiment, the sequence of the most representative member of the nucleic acid ligand family may be used as the basis of a SELEX process wherein the original pool of nucleic acid sequences is not completely randomized but contains biases towards the best known ligand. By these methods it is possible to optimize the SELEX process to arrive at the most preferred nucleic acid ligands.
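The two tailored starting pools described above can be sketched in a few lines. This is an illustration under stated assumptions rather than a description of how such pools are synthesized: the function names, the choice of conserved positions, and the per-position `bias` parameter are hypothetical, and the parent sequence AAUAACUC is used only because it appears elsewhere in this specification as the wild-type loop.

```python
import random

BASES = "ACGU"

def randomize_except(parent, conserved, rng):
    # First embodiment: positions listed in `conserved` are held fixed,
    # every other position is drawn uniformly at random.
    return "".join(
        base if i in conserved else rng.choice(BASES)
        for i, base in enumerate(parent)
    )

def doped_variant(parent, bias, rng):
    # Alternate embodiment: each position keeps the parent base with
    # probability `bias` (an assumed parameter); otherwise a base is
    # drawn uniformly at random, which may coincide with the parent base.
    return "".join(
        base if rng.random() < bias else rng.choice(BASES)
        for base in parent
    )

if __name__ == "__main__":
    rng = random.Random(0)
    parent = "AAUAACUC"
    conserved = {0, 3, 4}  # hypothetical conserved positions
    print([randomize_except(parent, conserved, rng) for _ in range(3)])
    print([doped_variant(parent, 0.7, rng) for _ in range(3)])
```

A higher `bias` keeps the pool closer to the parent sequence and explores fewer variants; a lower value approaches a fully randomized pool.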

[0023] A variety of nucleic acid primary, secondary and tertiary structures are known to exist. The structures or motifs that have been shown most commonly to be involved in non-Watson-Crick type interactions are referred to as hairpin loops, symmetric and asymmetric bulges, pseudoknots and myriad combinations of the same. Almost all known cases of such motifs suggest that they can be formed in a nucleic acid sequence of no more than 30 nucleotides. For this reason, it is preferred that SELEX procedures with contiguous randomized segments be initiated with nucleic acid sequences containing a randomized segment of between about 20-50 nucleotides, and in the most preferred embodiments between 25 and 40 nucleotides. This invention includes solutions comprising a mixture of between about 10⁹ to 10¹⁸ nucleic acid sequences having a contiguous randomized sequence of at least about 15 nucleotides in length. In the preferred embodiment, the randomized section of sequences is flanked by fixed sequences that facilitate the amplification of the ligands.
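The layout of a candidate with a contiguous randomized segment flanked by fixed sequences can be sketched as follows. The flanking sequences below are invented placeholders, not sequences from this specification, and the default segment length of 30 is simply one value inside the preferred 25 to 40 nucleotide range; the sketch builds DNA templates and makes no claim about how a physical pool is prepared.

```python
import random

# Invented placeholder flanks; a real design would use primer-binding
# sequences chosen for the amplification scheme.
FIVE_PRIME = "GGGAGACAAGAATAA"
THREE_PRIME = "TTCGACAGGAGGCTC"

def make_candidate(n_random, rng, five_prime=FIVE_PRIME, three_prime=THREE_PRIME):
    # Fixed 5' flank + contiguous randomized segment + fixed 3' flank.
    core = "".join(rng.choice("ACGT") for _ in range(n_random))
    return five_prime + core + three_prime

def make_pool(size, n_random=30, seed=0):
    # A seeded generator keeps the illustrative pool reproducible.
    rng = random.Random(seed)
    return [make_candidate(n_random, rng) for _ in range(size)]

if __name__ == "__main__":
    for template in make_pool(3):
        print(template)
```

Because every candidate carries the same flanks, a single primer pair complementary to those flanks would in principle amplify any member of the pool regardless of its randomized segment.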

[0024] In the case of a polymeric target, such as a protein, the ligand affinity can be increased by applying SELEX to a mixture of candidates comprising a first selected sequence and second randomized sequence. The sequence of the first selected ligand associated with binding or subportions thereof can be introduced into the randomized portion of the nucleic acids of a second test mixture. The SELEX procedure is repeated with the second test mixture to isolate a second nucleic acid ligand, having two sequences selected for binding to the target, which has increased binding strength or increased specificity of binding compared to the first nucleic acid ligand isolated. The sequence of the second nucleic acid ligand associated with binding to the target can then be introduced into the variable portion of the nucleic acids of a third test mixture which, after cycles of SELEX, results in a third nucleic acid ligand. These procedures can be repeated until a nucleic acid ligand of a desired binding strength or a desired specificity of binding to the target molecule is achieved. The process of iterative selection and combination of nucleic acid sequence elements that bind to a selected target molecule is herein designated “walking,” a term which implies the optimized binding to other accessible areas of a macromolecular target surface or cleft, starting from a first binding domain. Increasing the area of binding contact between ligand and target can increase the affinity constant of the binding reaction. These walking procedures are particularly useful for the isolation of nucleic acid antibodies which are highly specific for binding to a particular target molecule.

[0025] A variant of the walking procedure employs a non-nucleic acid ligand termed “anchor” which binds to the target molecule as a first binding domain. (See FIG. 9.) This anchor molecule can in principle be any non-nucleic acid molecule that binds to the target molecule and which can be covalently linked directly or indirectly to a nucleic acid. When the target molecule is an enzyme, for example, the anchor molecule can be an inhibitor or substrate of that enzyme. The anchor can also be an antibody or antibody fragment specific for the target. The anchor molecule is covalently linked to a nucleic acid oligomer of known sequence to produce a bridging molecule. The oligomer is preferably comprised of a minimum of about 3-10 bases. A test mixture of candidate nucleic acids is then prepared which includes a randomized portion and a sequence complementary to the known sequence of the bridging molecule. The bridging molecule is complexed to the target molecule. SELEX is then applied to select nucleic acids which bind to the complex of the bridging molecule and the target molecule. Nucleic acid ligands which bind to the complex are isolated. Walking procedures as described above can then be applied to obtain nucleic acid ligands with increased binding strength or increased specificity of binding to the complex. Walking procedures could employ selections for binding to the complex or the target itself. This method is particularly useful to isolate nucleic acid ligands which bind at a particular site within the target molecule. The complementary sequence in the test mixture acts to ensure the isolation of nucleic acid sequences which bind to the target molecule at or near the binding site of the bridging molecule. If the bridging molecule is derived from an inhibitor of the target molecule, this method is likely to result in a nucleic acid ligand which inhibits the function of the target molecule. It is particularly useful, for example, for the isolation of nucleic acids which will activate or inhibit protein function. The combination of ligand and target can have a new or enhanced function.

[0026] The nucleic acid ligands of the present invention may contain a plurality of ligand components. As described above, nucleic acid ligands derived by walking procedures may be considered as having more than one nucleic acid ligand component. This invention also includes nucleic acid antibodies that are constructed based on the results obtained by SELEX while not being identical to a nucleic acid ligand identified by SELEX. For example, a nucleic acid antibody may be constructed wherein a plurality of identical ligand structures are made part of a single nucleic acid. In another embodiment, SELEX may identify more than one family of nucleic acid ligands to a given target. In such case, a single nucleic acid antibody may be constructed containing a plurality of different ligand structures. SELEX experiments also may be performed wherein fixed identical or different ligand structures are joined by random nucleotide regions and/or regions of varying distance between the fixed ligand structures to identify the best nucleic acid antibodies.

[0027] Screens, selections or assays to assess the effect of binding of a nucleic acid ligand on the function of the target molecule can be readily combined with the SELEX methods. Specifically, screens for inhibition or activation of enzyme activity can be combined with the SELEX methods.

[0028] In more specific embodiments, the SELEX method provides a rapid means for isolating and identifying nucleic acid ligands which bind to proteins, including both nucleic acid-binding proteins and proteins not known to bind nucleic acids as part of their biological function. Nucleic acid-binding proteins include among many others polymerases and reverse transcriptases. The methods can also be readily applied to proteins which bind nucleotides, nucleosides, nucleotide co-factors and structurally related molecules.

[0029] In another aspect, the present invention provides a method for detecting the presence or absence of, and/or measuring the amount of a target molecule in a sample, which method employs a nucleic acid ligand which can be isolated by the methods described herein. Detection of the target molecule is mediated by its binding to a nucleic acid ligand specific for that target molecule. The nucleic acid ligand can be labeled, for example radiolabeled, to allow qualitative or quantitative detection. The detection method is particularly useful for target molecules which are proteins. The method is more particularly useful for detection of proteins which are not known to bind nucleic acids as part of their biological function. Thus, nucleic acid ligands of the present invention can be employed in diagnostics in a manner similar to conventional antibody-based diagnostics. One advantage of nucleic acid ligands over conventional antibodies in such detection methods and diagnostics is that nucleic acids are capable of being readily amplified in vitro, for example, by use of PCR amplification or related methods. Another advantage is that the entire SELEX process is carried out in vitro and does not require immunizing test animals. Furthermore, the binding affinity of nucleic acid ligands can be tailored to the user's needs.

[0030] Nucleic acid ligands of small molecule targets are useful as diagnostic assay reagents and have therapeutic uses as sequestering agents, drug delivery vehicles and modifiers of hormone action. Catalytic nucleic acids are selectable products of this invention. For example, by selecting for binding to transition state analogs of an enzyme catalyzed reaction, catalytic nucleic acids can be selected.

[0031] In yet another aspect, the present invention provides a method for modifying the function of a target molecule using nucleic acid ligands which can be isolated by SELEX. Nucleic acid ligands which bind to a target molecule are screened to select those which specifically modify function of the target molecule, for example to select inhibitors or activators of the function of the target molecule. An amount of the selected nucleic acid ligand which is effective for modifying the function of the target is combined with the target molecule to achieve the desired functional modification. This method is particularly applicable to target molecules which are proteins. A particularly useful application of this method is to inhibit protein function, for example to inhibit receptor binding to an effector or to inhibit enzyme catalysis. In this case, an amount of the selected nucleic acid molecule which is effective for target protein inhibition is combined with the target protein to achieve the desired inhibition.

BRIEF DESCRIPTION OF THE FIGURES

[0032] FIG. 1 is a diagram of the ribonucleotide sequence of a portion of the gene 43 messenger RNA which encodes the bacteriophage T4 DNA polymerase. Shown is the sequence in the region known to bind to gp43. The bold-faced capitalized letters indicate the extent of the information required for binding of gp43. The eight base-pair loop was replaced by randomized sequence to yield a candidate population for SELEX.

[0033] FIG. 2 is a schematic diagram of the SELEX process as exemplified for selecting loop sequence variants for RNAs that bind to T4 DNA polymerase (gp43). A DNA template for preparation of a test mixture of RNAs was prepared as indicated in step a by ligation of oligomers 3, 4 and 5, whose sequences are given in Table 1 infra. Proper ligation in step a was assured by hybridization with oligomers 1 and 2, which have complementary sequence (given in Table 1) that bridges oligomers 3 and 4 and 4 and 5, respectively. The resultant 110-base long template was gel-purified, annealed to oligo 1 and was used in in vitro transcription reactions (Milligan et al. (1987) Nucl. Acids Res. 15:8783-8798) to produce an initial RNA mixture containing randomized sequences of the 8-base loop, step b. The resultant transcripts were gel-purified and subjected to selection on nitrocellulose filters for binding to gp43 (step c), as described in Example 1. Selected RNAs were amplified in a three step process: (d) cDNA copies of the selected RNAs were made by reverse transcriptase synthesis using oligo 5 (Table 1) as a primer; (e) cDNAs were amplified using Taq DNA polymerase chain extension of oligo 1 (Table 1), which carries essential T7 promoter sequences, and oligo 5 (Table 1) as described in Innis et al. (1988) Proc. Natl. Acad. Sci. USA 85:9436; and (f) double-stranded DNA products of amplification were transcribed in vitro. The resultant selected amplified RNAs were used in the next round of selection.

[0034] FIG. 3 is a composite of autoradiographs of electrophoresed batch sequencing reactions of the in vitro transcripts derived from SELEX for binding of RNA loop variants to gp43. The figure indicates the change in loop sequence components as a function of number of selection cycles (for 2, 3 and 4 cycles) for selection conditions of experiment B in which the concentration of gp43 was 3×10⁻⁸ M and the concentration of RNA was about 3×10⁻⁵ M in all selection cycles. Sequencing was performed as described in Gauss et al. (1987) Mol. Gen. Genet. 206:24-34.

[0035] FIG. 4 is a composite of autoradiographs of batch RNA sequences of those RNAs selected from the fourth round of SELEX amplification for binding of RNA loop variants to gp43 employing different binding conditions. In experiment A, gp43 concentration was 3×10⁻⁸ M and RNA concentration was about 3×10⁻⁷ M. In experiment B, gp43 was 3×10⁻⁸ M and RNA was about 3×10⁻⁵ M. In experiment C, gp43 was 3×10⁻⁷ M and RNA was about 3×10⁻⁵ M.

[0036] FIG. 5 is a composite of autoradiographs of three sequencing gels for loop variants selected for binding to gp43 under the selection conditions of experiment B (see Example 1). The left hand sequence gel is the batch sequencing of selected RNAs after the fourth round of selection/amplification. The middle and right hand sequence gels are double-stranded DNA sequencing gels of two clonal isolates derived from the batch RNAs. The batch of RNA selected is composed of two major variants, one of which was the wild-type sequence (middle sequence gel), and the other a novel sequence (right hand gel).

[0037] FIG. 6 is a graph of percent RNA bound to gp43 as a function of gp43 concentration for different selected RNA loop sequence variants and for RNA with a randomized loop sequence. Binding of the wild-type loop sequence AAUAACUC is indicated as open circles, solid line; major variant loop sequence AGCAACCU as “x,” dotted line; minor variant loop sequence AAUAACUU as open squares, solid line; minor variant loop sequence AAUGACUC as solid circles, dotted line; minor variant loop sequence AGCGACCU as crosses, dotted line; and binding of the randomized mixture (NNNNNN) of loop sequences as open circles, dotted line.

[0038] FIG. 7 is a pictorial summary of results achieved after four rounds of SELEX to select a novel gp43 binding RNA from a candidate population randomized in the eight-base loop. SELEX did not yield the “apparent” consensus expected from the batch sequences shown in FIG. 4, but instead yielded wild type and a single major variant in about equal proportions and three single mutants. The frequencies of each species out of twenty cloned isolates are shown together with the approximate affinity constants (Kd) for each, as derived from filter binding assays shown in FIG. 6.

[0039]FIG. 8 is a series of diagrams showing synthesis of candidatenucleic acid ligands using the enzymes terminal transferase (TDT) andDNA polymerase (DNA pol). A 5′ primer or primary ligand sequence isprovided with a tail of randomized sequence by incubating with terminaltransferase in the presence of the four deoxynucleotide triphosphates(dNTPs). Homopolymer tailing of the randomized segment, using the sameenzyme in the presence of a single deoxynucleotide triphosphate (e.g.dCTP) provides an annealing site for poly-G tailed 3′ primer. Afterannealing, the double-stranded molecule is completed by the action ofDNA polymerase. The mixture can be further amplified, if desired, by thepolymerase chain reaction.

[0040]FIG. 9 is a diagram showing a process using SELEX to select alarge nucleic acid ligand having two spatially separate bindinginteractions with a target protein. The process is termed “walking”since it includes two stages, the second being an extension of thefirst. The upper part of the figure depicts a target (“protein ofinterest”) with a bound nucleic acid ligand selected by a first round ofSELEX (“evolved primary ligand”) bound to the protein at a first bindingsite. A reaction catalyzed by terminal transferase extends the length ofthe evolved primary ligand and generates a new set of randomizedsequence candidates having a conserved region containing the primaryligand. The lower part of the figure depicts the result of a secondround of SELEX based upon improved binding that results from thesecondary ligand interaction at the secondary binding site of theprotein. The terms “primary” and “secondary” are merely operative termsthat do not imply that one has higher affinity than the other.

[0041]FIGS. 10 and 11 are diagrams of a selection process using SELEX intwo stages. In FIG. 10, SELEX is applied to select ligands that bind tosecondary binding sites on a target complexed with a bridgingoligonucleotide connected to a specific binder, e.g., inhibitor of thetarget protein. The bridging oligonucleotide acts as a guide to favorselection of ligands that bind to accessible secondary binding sites. InFIG. 11, a second SELEX is applied to evolve ligands that bind at boththe secondary sites originally selected for and the primary targetdomain. The nucleic acids thereby evolved will bind very tightly, andmay themselves act as inhibitors of the target protein or to competeagainst inhibitors or substrates of the target protein.

[0042] FIGS. 12A and B show the sequence and placement of oligomers used to construct the candidate mixture used in Example 2. The top line shows the sequences of oligomers 1b and 2b from left to right, respectively (see Table 2 infra). The second line shows, from left to right, the sequences of oligomers 3b, 4b and 5b (Table 2). Proper ligation of the oligomers was assured by hybridization with oligomers 1b and 2b, whose sequences are complementary. The resultant ligated template was gel-purified, annealed to oligomer 1b and used in an in vitro transcription reaction (Milligan et al. (1987)) to produce an RNA candidate mixture, shown in the last line of the figure, labeled “in vitro transcript.” The candidate mixture contained a 32 nucleotide randomized segment, as shown.

[0043] FIG. 13 shows a hypothetical RNA sequence containing a variety of secondary structures that RNAs are known to adopt. Included are: A hairpin loops, B bulges, C asymmetric bulges, and D pseudoknots.

[0044] FIG. 14 shows nitrocellulose filter binding assays of ligand affinity for HIV-RT. Shown is the percent of input RNA that is bound to the nitrocellulose filter with varying concentrations of HIV-RT.

[0045] FIG. 15 shows additional nitrocellulose filter binding assays of ligand affinity for HIV-RT.

[0046] FIG. 16 shows information boundary determination for HIV-1 RT ligands 1.1 (FIG. 16A) and 1.3a (FIG. 16B) 3′ boundary determination. RNAs were 5′ end labeled, subjected to partial alkaline hydrolysis and selection on nitrocellulose filters, separated on a denaturing 8% polyacrylamide gel and autoradiographed. Approximately 90 picomoles of labeled RNA and 80 picomoles of HIV-1 RT were mixed in 0.5, 2.5, and 5 mL of buffer and incubated for 5 minutes at 37° C. prior to washing through a nitrocellulose filter. The eluted RNAs are shown under the final concentrations of HIV-1 RT used in each experiment. Also shown are the products of a partial RNase T1 digest which allows identification of the information boundary on the adjacent sequence as shown by arrows. (FIG. 16C) 5′ boundary determination. The 5′ boundary was determined in a) under the same conditions listed above.
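The final concentrations referred to in the description of FIG. 16 follow directly from the stated amounts and volumes, since one picomole per milliliter equals one nanomolar. The following is a minimal illustrative sketch of that arithmetic; the function name is hypothetical and is not part of the disclosure.

```python
# Illustrative arithmetic only: converts the amounts and volumes stated for
# FIG. 16 into final concentrations. 1 pmol/mL = 1e-12 mol / 1e-3 L = 1 nM.

def nanomolar(picomoles: float, volume_ml: float) -> float:
    """Return the concentration in nM for an amount in pmol and a volume in mL."""
    return picomoles / volume_ml

VOLUMES_ML = [0.5, 2.5, 5.0]   # buffer volumes stated in the text
RT_PMOL = 80.0                 # HIV-1 RT, as stated
RNA_PMOL = 90.0                # labeled RNA, as stated (approximate)

rt_nM = [nanomolar(RT_PMOL, v) for v in VOLUMES_ML]
rna_nM = [nanomolar(RNA_PMOL, v) for v in VOLUMES_ML]

for v, rt, rna in zip(VOLUMES_ML, rt_nM, rna_nM):
    print(f"{v} mL: RT {rt:.0f} nM, RNA {rna:.0f} nM")
```

Under these figures the three volumes correspond to approximately 160, 32 and 16 nanomolar HIV-1 RT.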

[0047] FIG. 17 shows the inhibition of HIV-1 RT by RNA ligand 1.1. A series of three-fold dilutions of 32N candidate mixture RNA and ligand 1.1 RNA, ranging in final reaction concentration from 10 micromolar to 4.6 nanomolar, was pre-mixed with HIV-RT and incubated for 5 minutes at 37° C. in 6 μL of 200 mM KOAc, 50 mM Tris-HCl, pH 7.7, 10 mM dithiothreitol, 6 mM Mg(OAc)₂, and 0.4 mM NTPs. In a separate tube RNA template (transcribed from a PCR product of a T7-1 obtained from U.S. Biochemical Corp. using oligos 7 and 9) and labeled oligo 9 were mixed and heated at 95° C. for one minute and cooled on ice for 15 minutes in 10 mM Tris-HCl, pH 7, 0.1 mM EDTA. Four μL of this template was added to each 6 μL enzyme-inhibitor mixture to start the reaction, which was incubated for a further 5 minutes at 37° C. and then stopped. The final concentration of HIV-1 RT was 16 nanomolar, of RNA template was 13 nanomolar, and of labeled primer was 150 nanomolar in all reactions. The extension products of each reaction are shown.
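The dilution series described for FIG. 17 can be reproduced numerically. The following is an illustrative sketch only; the count of eight concentrations is an assumption inferred from the stated endpoints (10 micromolar divided by three seven times gives about 4.6 nanomolar) and is not stated in the text.

```python
# Illustrative sketch: a three-fold serial dilution starting at 10 micromolar.
# ASSUMPTION: eight concentrations in total, inferred from the stated
# endpoints of 10 micromolar and 4.6 nanomolar.

def dilution_series(start_nM: float, fold: float, points: int) -> list:
    """Return `points` concentrations, each `fold` times lower than the last."""
    return [start_nM / fold**i for i in range(points)]

series_nM = dilution_series(10_000.0, 3.0, 8)

for c in series_nM:
    print(f"{c:10.1f} nM")
```

The lowest value of the series rounds to the 4.6 nanomolar stated above.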

[0048] FIG. 18 shows comparisons of HIV-1 RT inhibition by ligand 1.1 to effects on MMLV RT and AMV RT. Experiments were performed as in FIG. 17 except that 5-fold dilutions of inhibitor were prepared with the resultant concentrations as shown. The concentrations of each RT were normalized to that of HIV-RT by dilutions and comparison of gel band intensity with both Coomassie blue and silver stains, Biorad protein concentration assays, and activity assays.

[0049] FIG. 19 shows the consensus sequences of selected hairpins representing the R-17 coat protein ligand solution. The nucleotide representation at each position is indicated in grids. The column headed “bulge” represents the number of clones with an extra-helical nucleotide on one or both sides of the stem between the corresponding stem base-pairs. The column headed “end” represents the number of clones whose hairpin terminated at the previous base-pair.

[0050] FIG. 20 shows a binding curve of 30N bulk RNA for bradykinin. Analysis was done using spin columns; 10 mM KOAc, 10 mM DEM, pH 7.5; RNA concentration 1.5×10⁻⁸ M.

[0051] FIG. 21 shows templates for use in the generation of candidate mixtures that are enriched in certain structural motifs. Template A is designed to enrich the candidate mixture in hairpin loops. Template B is designed to enrich the candidate mixture in pseudoknots.

[0052] FIG. 22 is a schematic diagram of stem-loop arrangements for Motifs I and II of the HIV-rev ligand solution. The dotted lines in stems 1 and 2 between loops 1 and 3 indicate potential base-pairs.

[0053] FIG. 23 shows the folded secondary structures of rev ligand subdomains of isolates 6a, 1a, and 8 to show motifs I, II and III respectively. Also shown for comparison is the predicted fold of the wild type RRE RNA.

[0054] FIG. 24 is a graph of percent of input counts bound to a nitrocellulose filter with various concentrations of HIV rev protein. Also shown are the binding curves of the 32N starting population (#) and of the evolved population after 10 rounds (P) and of the wild type RRE sequence transcribed from a template composed of oligos 8 and 9 (W).

[0055] FIG. 25 is a comparison of Motif I(a) rev ligands. Parameters are as in FIG. 24. Also included is the binding curve of the “consensus” construct (C).

[0056] FIG. 26 is a comparison of Motif I(b) rev ligands. Parameters are as in FIG. 24.

[0057] FIG. 27 is a comparison of Motif II rev ligands. Parameters are as in FIG. 24.

[0058] FIG. 28 is a comparison of Motif III rev ligands. Parameters are as in FIG. 24.

[0059] FIG. 29 shows the consensus nucleic acid ligand solution to HIV rev referred to as Motif I.

[0060] FIG. 30 shows the consensus nucleic acid ligand solution to HIV rev referred to as Motif II.

[0061] FIG. 31 is a schematic representation of a pseudoknot. The pseudoknot consists of two stems and three loops, referred to herein as stems S₁ and S₂ and loops 1, 2 and 3.

DETAILED DESCRIPTION OF THE INVENTION

[0062] The following terms are used herein according to the following definitions.

[0063] Nucleic acid means either DNA or RNA, single-stranded or double-stranded, and any chemical modifications thereof, provided only that the modification does not interfere with amplification of selected nucleic acids. Such modifications include, but are not limited to, modifications at cytosine exocyclic amines, substitution of 5-bromo-uracil, backbone modifications, methylations, unusual base-pairing combinations and the like.

[0064] Ligand means a nucleic acid that binds another molecule (target). In a population of candidate nucleic acids, a ligand is one which binds with greater affinity than that of the bulk population. In a candidate mixture there can exist more than one ligand for a given target. The ligands can differ from one another in their binding affinities for the target molecule.

[0065] Candidate mixture is a mixture of nucleic acids of differing sequence, from which to select a desired ligand. The source of a candidate mixture can be naturally-occurring nucleic acids or fragments thereof, chemically synthesized nucleic acids, enzymatically synthesized nucleic acids or nucleic acids made by a combination of the foregoing techniques.

[0066] Target molecule means any compound of interest for which a ligand is desired. A target molecule can be a protein, peptide, carbohydrate, polysaccharide, glycoprotein, hormone, receptor, antigen, antibody, virus, substrate, metabolite, transition state analog, cofactor, inhibitor, drug, dye, nutrient, growth factor, etc., without limitation.

[0067] Partitioning means any process whereby ligands bound to target molecules, termed ligand-target pairs herein, can be separated from nucleic acids not bound to target molecules. Partitioning can be accomplished by various methods known in the art. Nucleic acid-protein pairs can be bound to nitrocellulose filters while unbound nucleic acids are not. Columns which specifically retain ligand-target pairs (or specifically retain bound ligand complexed to an attached target) can be used for partitioning. Liquid-liquid partition can also be used, as well as filtration, gel retardation, and density gradient centrifugation. The choice of partitioning method will depend on properties of the target and of the ligand-target pairs and can be made according to principles and properties known to those of ordinary skill in the art.

[0068] Amplifying means any process or combination of process steps thatincreases the amount or number of copies of a molecule or class ofmolecules. Amplifying RNA molecules in the disclosed examples wascarried out by a sequence of three reactions: making cDNA copies ofselected RNAs, using polymerase chain reaction to increase the copynumber of each cDNA, and transcribing the cDNA copies to obtain RNAmolecules having the same sequences as the selected RNAs. Any reactionor combination of reactions known in the art can be used as appropriate,including direct DNA replication, direct RNA amplification and the like,as will be recognized by those skilled in the art. The amplificationmethod should result in the proportions of the amplified mixture beingessentially representative of the proportions of different sequences inthe initial mixture.
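The interplay of partitioning and amplifying, as defined above, can be illustrated with a deliberately simplified numerical sketch. It is not the SELEXION analysis of the invention. It assumes that the fraction of each species retained at partitioning is [P]/([P] + Kd), which holds only when target is in excess over nucleic acid, that there is no background retention, and that amplification preserves proportions exactly. The species names, dissociation constants and starting fractions are hypothetical.

```python
# Simplified sketch of iterated partitioning and amplification.
# ASSUMPTIONS (not from the specification): bound fraction = P / (P + Kd),
# no background retention, and perfectly proportional amplification.
# All species names and numerical values are hypothetical.

def selex_round(fractions: dict, kd: dict, protein_conc: float) -> dict:
    """Partition by binding, then renormalize (proportional amplification)."""
    bound = {
        name: frac * protein_conc / (protein_conc + kd[name])
        for name, frac in fractions.items()
    }
    total = sum(bound.values())
    return {name: amount / total for name, amount in bound.items()}

def run_selex(fractions: dict, kd: dict, protein_conc: float, rounds: int) -> list:
    """Return the population composition before selection and after each round."""
    history = [dict(fractions)]
    for _ in range(rounds):
        fractions = selex_round(fractions, kd, protein_conc)
        history.append(fractions)
    return history

KD = {"tight": 1e-9, "bulk": 1e-6}            # hypothetical Kd values (M)
START = {"tight": 1e-4, "bulk": 1.0 - 1e-4}   # hypothetical starting fractions

history = run_selex(START, KD, protein_conc=3e-8, rounds=4)
for n, composition in enumerate(history):
    print(n, f"tight fraction = {composition['tight']:.4f}")
```

In this toy model a species present at one part in ten thousand becomes the predominant species within four rounds, because each round multiplies its ratio to the bulk by the ratio of the two bound fractions.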

[0069] Specific binding is a term which is defined on a case-by-casebasis. In the context of a given interaction between a given ligand anda given target, a binding interaction of ligand and target of higheraffinity than that measured between the target and the candidate ligandmixture is observed. In order to compare binding affinities, theconditions of both binding reactions must be the same, and should becomparable to the conditions of the intended use. For the most accuratecomparisons, measurements will be made that reflect the interactionbetween ligand as a whole and target as a whole. The nucleic acidligands of the invention can be selected to be as specific as required,either by establishing selection conditions that demand the requisitespecificity during SELEX, or by tailoring and modifying the ligandthrough “walking” and other modifications using interactions of SELEX.

[0070] Randomized is a term used to describe a segment of a nucleic acid having, in principle, any possible sequence over a given length. Randomized sequences will be of various lengths, as desired, ranging from about eight to more than 100 nucleotides. The chemical or enzymatic reactions by which random sequence segments are made may not yield mathematically random sequences due to unknown biases or nucleotide preferences that may exist. The term “randomized” is used instead of “random” to reflect the possibility of such deviations from ideality. In the techniques presently known, for example sequential chemical synthesis, large deviations are not known to occur. For short segments of 20 nucleotides or less, any minor bias that might exist would have negligible consequences. The longer the sequences of a single synthesis, the greater the effect of any bias.

[0071] A bias may be deliberately introduced into randomized sequence, for example, by altering the molar ratios of precursor nucleoside (or deoxynucleoside) triphosphates of the synthesis reaction. A deliberate bias may be desired, for example, to approximate the proportions of individual bases in a given organism, or to affect secondary structure.
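The effect of altered precursor ratios can be modeled in silico. The following is an illustrative sketch that assumes each position is drawn independently with probability proportional to the molar ratio of its precursor; real synthesis chemistry need not behave this way, and the ratios shown are hypothetical.

```python
# Illustrative model of a randomized segment with a deliberate base bias.
# ASSUMPTION: each position is drawn independently, with probability
# proportional to the molar ratio of its precursor. Ratios are hypothetical.
import random

def randomized_segment(length: int, ratios: dict, rng: random.Random) -> str:
    """Draw a segment of `length` bases using `ratios` as relative weights."""
    bases = list(ratios)
    weights = [ratios[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))

rng = random.Random(0)  # seeded so the sketch is repeatable

unbiased = randomized_segment(32, {"A": 1, "C": 1, "G": 1, "U": 1}, rng)
gc_rich = randomized_segment(32, {"A": 1, "C": 3, "G": 3, "U": 1}, rng)

print(unbiased)
print(gc_rich)
```

Setting one weight to zero excludes that base from the segment altogether, which is the limiting case of a deliberate bias.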

[0072] SELEXION refers to a mathematical analysis and computersimulation used to demonstrate the powerful ability of SELEX to identifynucleic acid ligands and to predict which variations in the SELEXprocess have the greatest impact on the optimization of the process.SELEXION is an acronym for Systematic Evolution of Ligands byEXponential enrichment with Integrated Optimization by Nonlinearanalysis.

[0073] Nucleic acid antibodies is a term used to refer to a class of nucleic acid ligands that are comprised of discrete nucleic acid structures or motifs that selectively bind to target molecules. Nucleic acid antibodies may be made up of double- or single-stranded RNA or DNA. The nucleic acid antibodies are synthesized, and in a preferred embodiment are constructed based on a ligand solution or solutions obtained for a given target by the SELEX process. In many cases, the nucleic acid antibodies of the present invention are not naturally occurring, while in other situations they may have significant similarity to a naturally occurring nucleic acid sequence.

[0074] The nucleic acid antibodies of the present invention include all nucleic acids having a specific binding affinity for a target, while not including the cases when the target is a polynucleotide which binds to the nucleic acid through a mechanism which predominantly depends on Watson/Crick base pairing or triple helix agents (See, Riordan, M. et al. (1991) Nature 350:442-443); provided, however, that when the nucleic acid antibody is double-stranded DNA, the target is not a naturally occurring protein whose physiological function depends on specific binding to double-stranded DNA.

[0075] RNA motif is a term generally used to describe the secondary ortertiary structure of RNA molecules. The primary sequence of an RNA is aspecific string of nucleotides (A, C, G or U) in one dimension. Theprimary sequence does not give information on first impression as to thethree dimensional configuration of the RNA, although it is the primarysequence that dictates the three dimensional configuration. In certaincases, the ligand solution obtained after performing SELEX on a giventarget may best be represented as a primary sequence. Althoughconformational information pertaining to such a ligand solution is notalways ascertainable based on the results obtained by SELEX, therepresentation of a ligand solution as a primary sequence shall not beinterpreted as disclaiming the existence of an integral tertiarystructure.

[0076] The secondary structure of an RNA motif is represented by contact in two dimensions between specific nucleotides. The most easily recognized secondary structure motifs are comprised of the Watson/Crick basepairs A:U and C:G. Non-Watson/Crick basepairs, often of lower stability, have been recognized, and include the pairs G:U, A:C, G:A, and U:U. (Base pairs are shown once; in RNA molecules the base pair X:Y by convention represents a sequence in which X is 5′ to Y, whereas the base pair Y:X is also allowed.) In FIG. 13 are shown a set of secondary structures, linked by single-stranded regions; the conventional nomenclature for the secondary structures includes hairpin loops, asymmetric bulged hairpin loops, symmetric hairpin loops, and pseudoknots.
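The base-pair categories listed above, and the convention that X:Y and Y:X are both allowed, can be expressed compactly. The following is an illustrative sketch; the function name and the label for unlisted pairs are hypothetical.

```python
# Illustrative classification of RNA base pairs using only the pairs listed
# in the text. Pairs are stored unordered, reflecting that X:Y and Y:X are
# both allowed. Names are hypothetical.

WATSON_CRICK = {frozenset("AU"), frozenset("CG")}
NON_WATSON_CRICK = {
    frozenset("GU"),
    frozenset("AC"),
    frozenset("GA"),
    frozenset("UU"),  # collapses to {"U"}, i.e. the U:U pair
}

def classify_pair(x: str, y: str) -> str:
    """Classify the pair x:y as Watson/Crick, non-Watson/Crick, or not listed."""
    pair = frozenset((x, y))
    if pair in WATSON_CRICK:
        return "Watson/Crick"
    if pair in NON_WATSON_CRICK:
        return "non-Watson/Crick"
    return "not listed"

for x, y in [("A", "U"), ("U", "A"), ("G", "U"), ("U", "U"), ("A", "A")]:
    print(f"{x}:{y} -> {classify_pair(x, y)}")
```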

[0077] When nucleotides that are distant in the primary sequence and notthought to interact through Watson/Crick and non-Watson/Crick base pairsare in fact interacting, these interactions (which are often depicted intwo dimensions) are also part of the secondary structure.

[0078] The three dimensional structure of an RNA motif is merely thedescription, in space, of the atoms of the RNA motif. Double-strandedRNA, fully base paired through Watson/Crick pairing, has a regularstructure in three dimensions, although the exact positions of all theatoms of the helical backbone could depend on the exact sequence ofbases in the RNA. A vast literature is concerned with secondarystructures of RNA motifs, and those secondary structures containingWatson/Crick base pairs are thought often to form A-form double-strandedhelices.

[0079] From A-form helices one can extend toward the other motifs inthree dimensions. Non-Watson/Crick base pairs, hairpin loops, bulges,and pseudoknots are structures built within and upon helices. Theconstruction of these additional motifs is described more fully in thetext.

[0080] The actual structure of an RNA includes all the atoms of thenucleotide of the molecule in three dimensions. A fully solved structurewould include as well bound water and inorganic atoms, although suchresolution is rarely achieved by a researcher. Solved RNA structures inthree dimensions will include all the secondary structure elements(represented as three dimensional structures) and fixed positions forthe atoms of nucleotides not restrained by secondary structure elements;due to base stacking and other forces extensive single stranded domainsmay have fixed structures.

[0081] Primary sequences of RNAs limit the possible three dimensionalstructures, as do the fixed secondary structures. The three dimensionalstructures of an RNA are limited by the specified contacts between atomsin two dimensions, and are then further limited by energy minimizations,the capacity of a molecule to rotate all freely rotatable bonds suchthat the resultant molecule is more stable than other conformers havingthe same primary and secondary sequence and structure.

[0082] Most importantly, RNA molecules have structures in threedimensions that are comprised of a collection of RNA motifs, includingany number of the motifs shown in FIG. 13.

[0083] Therefore, RNA motifs include all the ways in which it ispossible to describe in general terms the most stable groups ofconformations that a nucleic acid compound can form. For a given target,the ligand solution and the nucleic acid antibody may be one of the RNAmotifs described herein or some combination of several RNA motifs.

[0084] Ligand solutions are defined as the three dimensional structure held in common or as a family that define the conserved components identified through SELEX. For example, the ligands identified for a particular target may contain a primary sequence in common (NNNCGNAANUCGN′N′N′) (SEQ ID NO:1) which can be represented by a hairpin in two dimensions by:

    A A
   N   N
        U
    G C
    C G
    N N′
    N N′
    N N′
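Membership in a family of this kind can be tested mechanically. The following is an illustrative sketch that assumes each N′ is the Watson/Crick complement of the N with which it pairs in the stem; the text does not restrict N:N′ to Watson/Crick pairs, so this is a narrowing assumption. The function name and test sequences are hypothetical.

```python
# Illustrative membership test for the example hairpin family
# NNNCGNAANUCGN'N'N'. ASSUMPTION: each N' is the Watson/Crick complement of
# its partner N. Names and example sequences are hypothetical.
import re

COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}

# Stem 5' side (NNN), conserved CG, loop N-A-A-N-U, conserved CG, stem 3' side.
FAMILY = re.compile(r"^([ACGU]{3})CG[ACGU]AA[ACGU]UCG([ACGU]{3})$")

def in_hairpin_family(seq: str) -> bool:
    """Return True if seq fits the conserved positions and the stem pairs up."""
    match = FAMILY.match(seq)
    if match is None:
        return False
    five_prime, three_prime = match.groups()
    # The 3' stem strand must be the reverse complement of the 5' stem strand.
    expected = "".join(COMPLEMENT[b] for b in reversed(five_prime))
    return three_prime == expected

print(in_hairpin_family("GGACGAAAGUCGUCC"))  # conserved positions and stem fit
print(in_hairpin_family("GGACGAAAGUCGAAA"))  # stem does not pair
```

A fuller treatment would also admit the variable loop lengths contemplated by the definition of a ligand solution; this sketch fixes the loop at five nucleotides.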

[0085] The three dimensional structure would thus be insensitive to theexact sequence of three of the five base pairs and two of the five loopnucleotides, and would in all or most versions of the sequence/structurebe an appropriate ligand for further use. Thus ligand solutions aremeant to represent a potentially large collection of appropriatesequence/structures, each identified by the family description which isinclusive of all exact sequence/structure solutions. It is furthercontemplated through this definition that ligand solutions need notinclude only members with exact numerical equivalency between thevarious components of an RNA motif. Some ligands may have loops, forexample, of five nucleotides while other ligands for the same target maycontain fewer or more nucleotides in the equivalent loop and yet beincluded in the description of the ligand solution.

[0086] Although the ligand solution derived by SELEX may include arelatively large number of potential members, the ligand solutions aretarget specific and, for the most part, each member of the ligandsolution family can be used as a nucleic acid antibody to the target.The selection of a specific member from a family of ligand solutions tobe employed as a nucleic acid antibody can be made as described in thetext and may be influenced by a number of practical considerations thatwould be obvious to one of ordinary skill in the art.

[0087] The method of the present invention was developed in connection with investigations of translational regulation in bacteriophage T4 infection. Autoregulation of the synthesis of certain viral proteins, such as the bacteriophage T4 DNA polymerase (gp43), involves binding of the protein to its own message, blocking its translation. The SELEX method was used to elucidate the sequence and structure requirements of the gp43 RNA binding site. SELEX allowed the rapid selection of preferred binding sequences from a population of random nucleic acid sequences. While exemplified by the isolation and identification of nucleic acid sequences which bind to proteins known to bind to RNA, the method of the present invention is generally applicable to the selection of a nucleic acid capable of binding any given protein. The method is applicable to selection of nucleic acids which bind to proteins which do not (or are not known to) bind to nucleic acid as a part of their natural activity or biological function. The SELEX method requires no knowledge of the structure or sequence of a binding site and no knowledge of the structure or sequence of the target protein. The method does not depend on purified target protein for selections. In general, application of SELEX will enrich for ligands of the most abundant target. In a mixture of ligands, techniques for isolating the ligand of a given target are available. For example, another ligand (e.g., substrate, inhibitor, antibody) of the desired target can be used to compete specifically for binding the target, so that the desired nucleic acid ligand can be partitioned from ligands of other targets.

[0088] In the preferred embodiment, ligands derived by SELEX arecomprised of single stranded RNA sequences. It is a critical element ofthis invention that the present inventors were able to make conclusionsabout RNA that are contrary to those commonly held in the field, and touse these conclusions to tailor the SELEX process to achieve nucleicacid antibodies derived from ligand solutions.

[0089] RNA was first appreciated as an information messenger between theDNA sequences that are the genes and the protein sequences that arefound within enzymes and other proteins. From the first moments afterWatson and Crick described the structure of DNA and the connectionbetween DNA sequence and protein sequence, the means by which proteinswere synthesized became central to much experimental biochemistry.Eventually messenger RNA (mRNA) was identified as the chemicalintermediate between genes and proteins. A majority of RNA speciespresent in organisms are mRNAs, and thus RNA continues to be seenlargely as an informational molecule. RNA serves its role as aninformational molecule largely through the primary sequence ofnucleotides, in the same way that DNA serves its function as thematerial of genes through the primary sequence of nucleotides; that is,information in nucleic acids can be represented in one dimension.

[0090] As the biochemistry of gene expression was studied, several RNAmolecules within cells were discovered whose roles were notinformational. Ribosomes were discovered to be the entities upon whichmRNAs are translated into proteins, and ribosomes were discovered tocontain essential RNA (ribosomal RNAs, or rRNAs). rRNAs for many yearswere considered to be structural, a sort of scaffold upon which theprotein components of the ribosome were “hung” so as to allow theprotein components of the ribosome to perform the protein syntheticaction of the ribosome. An additional large class of RNAs, the transferRNAs (tRNAs), were postulated and found. tRNAs are the chemicallybifunctional adapters that recognize codons within mRNA and carry theamino acids that are condensed into protein. Most importantly, eventhough a tRNA structure was determined by X-ray analysis in 1974, RNAswere considered to be primarily “strings” in one dimension for anadditional decade. rRNA occupied a strange position in the researchcommunity. For a long period almost no one sensed the reason behind thedeep similarities in rRNAs from various species, and the true chemicalcapacity of RNA molecules. Several researchers postulated that RNA mightonce have served an enzymatic rather than informational role, but thesepostulates were never intended to be predictive about present functionsof RNA.

[0091] Tom Cech's work on ribozymes—a new class of RNAmolecules—expanded the view of the functional capacity of RNA. The groupI introns are able to splice autocatalytically, and thus at least somelimited catalysis is within the range of RNA. Within this range ofcatalysis is the activity of the RNA component of RNase P, an activitydiscovered by Altman and Pace. Cech and Altman received the Nobel Prizein Chemistry for their work, which fundamentally changed the previouslimitations for RNA molecules to informational roles. rRNAs, because ofthe work of Cech and Altman, are now thought by some to be the catalyticcenter of the ribosome, and are no longer thought to be merelystructural.

[0092] It is a central premise of this Invention that RNA moleculesremain underestimated by the research community, with respect to bindingand other capacities. While ribozymes have caused a remarkable increasein research aimed at RNA functions, the present application contemplatesthat the shape possibilities for RNA molecules (and probably DNA aswell) afford an opportunity to use SELEX to find RNAs with virtually anybinding function. It is further contemplated that the range of catalyticfunctions possible for RNA is broad beyond the present conventionalwisdom, although not necessarily as broad as that of proteins.

[0093] The three dimensional shapes of some RNAs are known directly fromeither X-ray diffraction or NMR methodologies. The existing data set issparse. The structures of four tRNAs have been solved, as well as threesmaller RNA molecules: two small hairpins and a small pseudoknot. Thevarious tRNAs, while related, have elements of unique structure; forexample, the anticodon bases of the elongator tRNAs are displayed towardthe solvent, while the anticodon bases of an initiator tRNA are pointedmore away from the solvent. Some of these differences may result fromcrystal lattice packing forces, but some are also no doubt a result ofidiosyncratic energy minimization by different single stranded sequenceswithin homologous secondary and three dimensional structures.

[0094] Sequence variations of course are vast. If a single stranded loop of an RNA hairpin contains eight nucleotides, 65,536 different sequences comprise the saturated sequence “space.” Although not bound to the theory of this assertion, the inventors of this invention believe that each member of that set will have, through energy minimization, a most stable structure, and the bulk of those structures will present subtly distinct chemical surfaces to the solvent or to potential interacting target molecules such as proteins. Thus, when all 65,536 sequences within a particular structural motif were tested against the bacteriophage T4 DNA polymerase, two sequences from that set bound better than all others. This suggests that structural aspects of those two sequences are special for that target, and that the remaining 65,534 sequences are not as well suited for binding to the target. It is almost certain that within those 65,536 sequences are other individual members or sets that would be best suited for interacting with other targets.
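The size of the saturated sequence space follows from four possible nucleotides at each of eight positions, i.e. 4⁸ = 65,536. The following is an illustrative sketch that computes the count and enumerates the loop sequences; the function names are hypothetical.

```python
# Illustrative count and enumeration of the saturated sequence space for a
# fully randomized RNA loop. Function names are hypothetical.
from itertools import product

BASES = "ACGU"

def sequence_space_size(loop_length: int) -> int:
    """Number of distinct sequences of the given length over A, C, G, U."""
    return len(BASES) ** loop_length

def enumerate_loops(loop_length: int):
    """Yield every possible loop sequence of the given length."""
    for combo in product(BASES, repeat=loop_length):
        yield "".join(combo)

space_size = sequence_space_size(8)
print(space_size)

# The two best-binding loop sequences reported for gp43 are members of the set.
all_loops = set(enumerate_loops(8))
print("AAUAACUC" in all_loops, "AGCAACCU" in all_loops)
```

The same calculation shows how quickly the space grows with length: a 32 nucleotide randomized segment has 4³² possible sequences, far more than can be enumerated in this manner.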

[0095] A key concept in this description of RNA structures is that everysequence will find its most stable structure, even though RNAs are oftendrawn so as to suggest a random coil or floppy, unstructured element.Homopolymers of RNA, unable to form Watson/Crick base pairs, are oftenfound to have a non-random structure attributed to stacking energygained by fixing the positions of adjacent bases over each other.Clearly sequences involving all four nucleotides may have local regionsof fixed structure, and even without Watson/Crick base pairs anon-uniform sequence may have more structure than is at first presumed.The case for fixed structures in RNA loops is even stronger. Theanticodon loops of tRNAs have a structure, and so do—presumably—the twowinning sequences that bind best to T4 DNA polymerase.

[0096] Antiparallel strands of complementary sequence in RNA yield A-form helices, from which loop sequences emerge and return. Even if the loop sequences do not have a strong capacity to interact, energy minimization is an energetically free structure optimization (that is, no obvious energies of activation block energy minimization of a loop sequence). A kinetically likely starting point for optimization may be the loop closing base pair of an RNA stem, which presents a flat surface upon which optimal stacking of loop nucleotides and bases may occur. Loops of RNA are in principle equivalent to loops of protein connecting antiparallel alpha-helices or beta-strands. Although these protein loops are often called random coils, they are neither random nor coiled. Such loops are called “omega” structures, reflecting that the loop emerges and returns to positions that are relatively close to each other (see Leszczynski, J. and Rose, G. et al. (1986) Science 234:849-855); those positions in a protein are conceptually equivalent to the loop closing base pair of an RNA hairpin.

[0097] Many omega structures have been solved by X-ray diffraction, and the structures are idiosyncratic. Clearly each structure is the result of a unique energy minimization acting upon a loop whose ends are close to each other. Both in proteins and RNAs those loops will minimize their energy without information from the rest of the structure except, to a first approximation, the loop closing pair of amino acids or base pair. For both protein omega loops and RNA hairpin loops all the freely rotatable bonds will participate in the attempt to minimize the free energy. RNA, it seems, will be rather more responsive to electrostatics than proteins, while proteins will have many more degrees of freedom than RNAs. Thus, calculations of RNA structures through energy minimization are more likely to yield accurate solution structures than are comparable calculations for proteins.

[0098] Single-stranded regions of both RNAs and proteins may be held so as to extend the possible structure. That is, if a single-stranded loop emerges and returns in a protein structure from parallel strands of alpha-helix or beta-strands, the points of emergence and return are further from each other than in the omega structures. Furthermore, the distance spanned by the single strand of peptide can be varied by the lengths of parallel alpha-helix or beta-strand.

[0099] For those protein structures in which the single strand lies upon a fixed protein secondary structure, the resultant energy minimization could, in principle, allow interactions between the single-stranded domain and the underlying structure. It is likely that amino acid side chains that can form salt bridges in secondary structures could do the same in extended single strands lying on top of regular secondary structures. Thus the exact structures of such protein regions will again be idiosyncratic, and very much sequence dependent. In this case the sequence dependence will include both the single strand and the underlying sequence of the secondary structure.

[0100] Interestingly, an RNA structure known as a pseudoknot is analogous to these extended protein motifs, and may serve to display toward solvent or target molecules extended single strands of RNA whose bases are idiosyncratically arrayed toward either the solvent/target or an underlying RNA secondary structure. Pseudoknots have, in common with protein motifs based on loops between parallel strands, the capacity to alter the length of single strand and the sequence of the helix upon which it lies.

[0101] Thus, exactly as in protein motifs, by covariation with sequences in the underlying secondary structure it is possible to display single-stranded nucleotides and bases toward either the solvent or the underlying structure, thus altering the electrostatics and the functional chemical groups that are interacting with targets. It is important to note that such structure variations follow from energy minimizations, but only one pseudoknot structure is known, even at low resolution. Nevertheless, the value of this Invention arises out of the recognition that the shapes and functional displays possible from pseudoknots are nearly infinite in their unique qualities.

[0102] Both hairpin loops and the single-stranded domain of pseudoknots are built upon antiparallel RNA helices. Helices of RNA may contain irregularities, called bulges. Bulges can exist in one strand of a helix or both, and will provide idiosyncratic structural features useful for target recognition. Additionally, helix irregularities can provide angled connections between regular helices.

[0103] A large bulge (see FIG. 13) on one strand of RNA may be comparable to hairpin loops, except that the loop closing base pair is replaced by the two base pairs flanking the bulge.

[0104] Asymmetric bulges (see FIG. 13) may provide an elongated and irregular structure that is stabilized by nucleotide contacts across the bulge. These contacts may involve Watson/Crick interactions or any other stabilizing arrangement, including other hydrogen bonds and base stacking.

[0105] Finally, when contemplating fixed RNA shapes or motifs, it is instructive to consider what substantial differences exist between RNA and proteins. Since protein is thought to have displaced RNA during evolution for those activities now carried out almost entirely by proteins and peptides, including catalysis and highly specific recognition, the chemical properties of proteins are thought to be more useful than those of RNA for constructing variable shapes and activities. The standard reasoning includes the existence of 20 amino acids versus only four nucleotides, the strong ionic qualities of lysine, arginine, aspartic acid, and glutamic acid which have no counterpart in the RNA bases, the relative neutrality of the peptide backbone when compared to the strongly negative sugar-phosphate backbone of nucleic acids, the existence of histidine with a pK near neutrality, the fact that the side chains of the amino acids point toward the solvent in both alpha-helices and beta-strands, and the regular secondary structures of proteins. In the double-stranded nucleic acids, including RNA, base pairs point the bases toward each other and utilize much of the chemical information present at the one-dimensional level. Thus, from every angle presently understood to contribute to shape diversity and function, proteins are thought to be vastly superior to nucleic acids, including RNA. During evolution, proteins were chosen for recognition and catalysis over RNA, thus supporting the present widely held view.

[0106] Conversely, and central to this Invention, the vast number of sequences and shapes possible for RNA will conceivably allow, especially with sequences never tested during evolutionary history, every desired function and binding affinity even though RNA is made up of only four nucleotides and even though the backbone of an RNA is so highly charged. That is, the RNA motifs described above, with appropriate sequence specifications, will yield in space those chemical functions needed to provide tight and specific binding to most targets. It may be suggested that RNA is as versatile as the immune system. That is, while the immune system provides a fit to any desired target, RNA provides those same opportunities. The enabling methodology described herein can utilize 10¹⁸ sequences, and thus try vast numbers of structures such that whatever intrinsic advantages proteins or specifically antibodies may have over RNA are compensated for by the vastness of the possible “pool” from which RNA ligands are selected. In addition, with the use of modified nucleotides, RNA can be used that is intrinsically more chemically varied than natural RNAs.

[0107] The SELEX method involves the combination of a selection of nucleic acid ligands which bind to a target molecule, for example a protein, with amplification of those selected nucleic acids. Iterative cycling of the selection/amplification steps allows selection of one or a small number of nucleic acids which bind most strongly to the target from a pool which contains a very large number of nucleic acids.

[0108] Cycling of the selection/amplification procedure is continued until a selected goal is achieved. For example, cycling can be continued until a desired level of binding of the nucleic acids in the test mixture is achieved or until a minimum number of nucleic acid components of the mixture is obtained (in the ultimate case until a single species remains in the test mixture). In many cases, it will be desired to continue cycling until no further improvement of binding is achieved. It may be the case that certain test mixtures of nucleic acids show limited improvement in binding over background levels during cycling of the selection/amplification. In such cases, the sequence variation in the test mixture should be increased by including more of the possible sequence variants, or the length of the sequence randomized region should be increased, until improvements in binding are achieved. Anchoring protocols and/or walking techniques can be employed as well.

[0109] Specifically, the method requires the initial preparation of a test mixture of candidate nucleic acids. The individual test nucleic acids can contain a randomized region flanked by sequences conserved in all nucleic acids in the mixture. The conserved regions are provided to facilitate amplification of selected nucleic acids. Since there are many such sequences known in the art, the choice of sequence is one which those of ordinary skill in the art can make, having in mind the desired method of amplification. The randomized region can have a fully or partially randomized sequence. Alternatively, this portion of the nucleic acid can contain subportions that are randomized, along with subportions which are held constant in all nucleic acid species in the mixture. For example, sequence regions known to bind, or selected for binding, to the target protein can be integrated with randomized regions to achieve improved binding or improved specificity of binding. Sequence variability in the test mixture can also be introduced or augmented by generating mutations in the nucleic acids in the test mixture during the selection/amplification process. In principle, the nucleic acids employed in the test mixture can be any length as long as they can be amplified. The method of the present invention is most practically employed for selection from a large number of sequence variants. Thus, it is contemplated that the present method will preferably be employed to assess binding of nucleic acid sequences ranging in length from about four bases to any attainable size.

[0110] The randomized portion of the nucleic acids in the test mixture can be derived in a number of ways. For example, full or partial sequence randomization can be readily achieved by direct chemical synthesis of the nucleic acid (or portions thereof) or by synthesis of a template from which the nucleic acid (or portions thereof) can be prepared by use of appropriate enzymes. End addition, catalyzed by terminal transferase in the presence of nonlimiting concentrations of all four nucleotide triphosphates, can add a randomized sequence to a segment. Sequence variability in the test nucleic acids can also be achieved by employing size-selected fragments of partially digested (or otherwise cleaved) preparations of large, natural nucleic acids, such as genomic DNA preparations or cellular RNA preparations. In those cases in which randomized sequence is employed, it is not necessary (or, for long randomized segments, possible) that the test mixture contain all possible variant sequences. It will generally be preferred that the test mixture contain as large a number of possible sequence variants as is practical for selection, to ensure that a maximum number of potential binding sequences are identified. A randomized sequence of 30 nucleotides will contain a calculated 10¹⁸ different candidate sequences. As a practical matter, it is convenient to sample only about 10¹⁸ candidates in a single selection. Practical considerations include the number of templates on the DNA synthesis column, and the solubility of RNA and the target in solution. (Of course, there is no theoretical limit for the number of sequences in the candidate mixture.) Therefore, candidate mixtures that have randomized segments longer than 30 contain too many possible sequences for all to be conveniently sampled in one selection. It is not necessary to sample all possible sequences of a candidate mixture to select a nucleic acid ligand of the invention.
It is basic to the method that the nucleic acids of the test mixture are capable of being amplified. Thus, it is preferred that any conserved regions employed in the test nucleic acids do not contain sequences which interfere with amplification.
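The arithmetic behind the 30-nucleotide figure can be checked with a short sketch. The following Python is illustrative only and is not part of the described method; the function names are hypothetical, and the estimate of expected coverage assumes that every sequence is synthesized with equal probability (an idealization that real synthesis only approximates).

```python
import math


def sequence_space_size(length: int) -> int:
    """Number of distinct sequences of `length` built from four nucleotides."""
    return 4 ** length


def coverage_upper_bound(length: int, molecules_sampled: int = 10**18) -> float:
    """Largest fraction of the sequence space that the sample could contain."""
    return min(1.0, molecules_sampled / sequence_space_size(length))


def expected_coverage(length: int, molecules_sampled: int = 10**18) -> float:
    """Expected fraction of distinct sequences present at least once.

    Assumption (not from the source): molecules are drawn uniformly at
    random, so copy numbers per sequence are approximately Poisson.
    """
    mean_copies = molecules_sampled / sequence_space_size(length)
    return -math.expm1(-mean_copies)


if __name__ == "__main__":
    for n in (8, 15, 25, 30, 40):
        print(n, sequence_space_size(n), coverage_upper_bound(n))
```

Under these assumptions a sample of 10¹⁸ molecules can hold essentially every sequence of a 25-nucleotide randomized region, most but not all of a 30-nucleotide region, and only a vanishing fraction of a 40-nucleotide region.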

[0111] The various RNA motifs described above can almost always be defined by a polynucleotide containing about 30 nucleotides. Because of the physical constraints of the SELEX process, a randomized mixture containing about 30 nucleotides is also about the longest contiguous randomized segment which can be utilized while being able to test substantially all of the potential variants. It is, therefore, a preferred embodiment of this invention, when utilizing a candidate mixture with a contiguous randomized region, to use a randomized sequence of at least 15 nucleotides in a mixture containing at least about 10⁹ nucleic acids; in the most preferred embodiment the randomized sequence contains at least 25 nucleotides.

[0112] This Invention includes candidate mixtures containing all possible variations of a contiguous randomized segment of at least 15 nucleotides. Each individual member in the candidate mixture may also be comprised of fixed sequences flanking the randomized segment that aid in the amplification of the selected nucleic acid sequences.

[0113] Candidate mixtures may also be prepared containing both randomized sequences and fixed sequences wherein the fixed sequences serve a function in addition to the amplification process. In one embodiment of the Invention, the fixed sequences in a candidate mixture may be selected in order to enhance the percentage of nucleic acids in the candidate mixture possessing a given nucleic acid motif. For example, the incorporation of the appropriate fixed nucleotides will make it possible to increase the percentage of pseudoknots or hairpin loops in a candidate mixture. A candidate mixture that has been prepared including fixed sequences that enhance the percentage of a given nucleic acid structural motif is, therefore, a part of this invention. One skilled in the art, upon routine inspection of a variety of nucleic acid antibodies as described herein, will be able to construct, without undue experimentation, such a candidate mixture. Examples 2 and 8 below describe specific examples of candidate mixtures engineered to maximize preferred RNA motifs.

[0114] Candidate mixtures containing various fixed sequences or using a purposefully partially randomized sequence may also be employed after a ligand solution or partial ligand solution has been obtained by SELEX. A new SELEX process may then be initiated with a candidate mixture informed by the ligand solution.

[0115] Polymerase chain reaction (PCR) is an exemplary method for amplifying nucleic acids. Descriptions of PCR methods are found, for example, in Saiki et al. (1985) Science 230:1350-1354; Saiki et al. (1986) Nature 324:163-166; Scharf et al. (1986) Science 233:1076-1078; Innis et al. (1988) Proc. Natl. Acad. Sci. 85:9436-9440; and in U.S. Pat. No. 4,683,195 (Mullis et al.) and U.S. Pat. No. 4,683,202 (Mullis et al.). In its basic form, PCR amplification involves repeated cycles of replication of a desired single-stranded DNA (or cDNA copy of an RNA) employing specific oligonucleotide primers complementary to the 3′ and 5′ ends of the ssDNA, primer extension with a DNA polymerase, and DNA denaturation. Products generated by extension from one primer serve as templates for extension from the other primer. A related amplification method described in PCT published application WO 89/01050 (Burg et al.) requires the presence or introduction of a promoter sequence upstream of the sequence to be amplified, to give a double-stranded intermediate. Multiple RNA copies of the double-stranded promoter-containing intermediate are then produced using RNA polymerase. The resultant RNA copies are treated with reverse transcriptase to produce additional double-stranded promoter-containing intermediates which can then be subject to another round of amplification with RNA polymerase. Alternative methods of amplification include, among others, cloning of selected DNAs or cDNA copies of selected RNAs into an appropriate vector and introduction of that vector into a host organism where the vector and the cloned DNAs are replicated and thus amplified (Guatelli, J. C. et al. (1990) Proc. Natl. Acad. Sci. 87:1874).
In general, any means that will allow faithful, efficient amplification of selected nucleic acid sequences can be employed in the method of the present invention. It is only necessary that the proportionate representation of sequences after amplification at least roughly reflects the relative proportions of sequences in the mixture before amplification.

[0116] Specific embodiments of the present invention for amplifying RNAs were based on Innis et al. (1988) supra. The RNA molecules and target molecules in the test mixture were designed to provide, after amplification and PCR, essential T7 promoter sequences in their 5′ portions. Full-length cDNA copies of selected RNA molecules were made using reverse transcriptase primed with an oligomer complementary to the 3′ sequences of the selected RNAs. The resultant cDNAs were amplified by Taq DNA polymerase chain extension, providing the T7 promoter sequences in the selected DNAs. Double-stranded products of this amplification process were then transcribed in vitro. Transcripts were used in the next selection/amplification cycle. The method can optionally include appropriate nucleic acid purification steps.

[0117] In general any protocol which will allow selection of nucleic acids based on their ability to bind specifically to another molecule, i.e., a protein or in the most general case any target molecule, can be employed in the method of the present invention. It is only necessary that the selection partition nucleic acids which are capable of being amplified. For example, a filter binding selection, as described in Example 1, can be used, in which a test nucleic acid mixture is incubated with target protein and the nucleic acid/protein mixture is then filtered through a nitrocellulose filter and washed with appropriate buffer to remove free nucleic acids. Protein/nucleic acid complexes often remain bound to the filter. The relative concentrations of protein to test nucleic acid in the incubated mixture influence the strength of binding that is selected for.

[0118] When nucleic acid is in excess, competition for available binding sites occurs and those nucleic acids which bind most strongly are selected. Conversely, when an excess of protein is employed, it is expected that any nucleic acid that binds to the protein will be selected. The relative concentrations of protein to nucleic acid employed to achieve the desired selection will depend on the type of protein, the strength of the binding interaction and the level of any background binding that is present. The relative concentrations needed to achieve the desired selection result can be readily determined empirically without undue experimentation. Similarly, it may be necessary to optimize the filter washing procedure to minimize background binding. Again such optimization of the filter washing procedures is within the skill of the ordinary artisan.
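The effect of the protein-to-nucleic-acid ratio on stringency can be illustrated with a simple equilibrium sketch. This is an assumed textbook 1:1 binding model, not a calculation taken from this application; the function names and the dissociation constants are hypothetical, and the model takes the free protein concentration as given rather than solving for depletion of protein by the nucleic acid pool.

```python
def bound_fraction(free_protein: float, kd: float) -> float:
    """Fraction of a nucleic acid species bound at equilibrium (1:1 binding).

    Both arguments are molar concentrations. `free_protein` is the
    concentration of unoccupied protein, which is low when nucleic acid
    is in excess and high when protein is in excess.
    """
    return free_protein / (free_protein + kd)


def discrimination(free_protein: float, kd_strong: float, kd_weak: float) -> float:
    """Ratio of bound fractions for a strong binder versus a weak binder."""
    return bound_fraction(free_protein, kd_strong) / bound_fraction(free_protein, kd_weak)


if __name__ == "__main__":
    # Hypothetical affinities: 1 nM (strong) and 1 uM (weak).
    for protein in (1e-12, 1e-9, 1e-6, 1e-3):
        print(protein, discrimination(protein, kd_strong=1e-9, kd_weak=1e-6))
```

In this model the ratio approaches the ratio of the dissociation constants when free protein is scarce, and approaches 1 when protein is abundant, which is consistent with the statement that excess protein captures any nucleic acid that binds at all.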

[0119] A mathematical evaluation of SELEX referred to as SELEXION has been utilized by the inventors of the present invention. Appendix A to this application includes a brief review of the mathematical analysis utilized to obtain generalizations regarding SELEX derived from SELEXION.

[0120] The generalizations obtained from SELEXION are as follows: 1) The likelihood of recovering the best-binding RNA in each round of SELEX increases with the number of such molecules present, with their binding advantage versus the bulk RNA pool, and with the total amount of protein used. Although it is not always intuitively obvious to know in advance how to maximize the difference in binding, the likelihood of recovering the best-binding RNA still can be increased by maximizing the number of RNA molecules and target molecules sampled; 2) the ideal nucleic acid and protein concentrations to be used in various rounds of SELEX are dependent on several factors. The experimental parameters suggested by SELEXION parallel those employed in the Examples hereto. For example, when the relative affinity of the ultimate ligand solution is not known (which will almost inevitably be the case when SELEX is performed), it is preferred that the protein and nucleic acid candidate mixture concentrations are selected to provide a binding between about 3 and 7 percent of the total of nucleic acids to the protein target. By using this criterion it can be expected that a tenfold to twentyfold enrichment in high affinity ligands will be achieved in each round of SELEX.
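How a tenfold enrichment per round compounds over successive rounds can be sketched as follows. This toy model is not the SELEXION analysis of Appendix A; it assumes only two classes of molecule (a high-affinity ligand and the bulk pool), perfect proportional amplification, and hypothetical retention values of 50% for the ligand and 5% for the bulk, the latter chosen to fall within the 3 to 7 percent range stated above.

```python
def enrich_one_round(fraction_high: float, bound_high: float, bound_bulk: float) -> float:
    """Fraction of high-affinity ligand in the pool after one selection round.

    `bound_high` and `bound_bulk` are the fractions of each class retained
    by the partitioning step; amplification is assumed to preserve ratios.
    """
    retained_high = fraction_high * bound_high
    retained_bulk = (1.0 - fraction_high) * bound_bulk
    return retained_high / (retained_high + retained_bulk)


def rounds_to_dominate(initial_fraction: float,
                       bound_high: float = 0.5,
                       bound_bulk: float = 0.05,
                       threshold: float = 0.9,
                       max_rounds: int = 100) -> int:
    """Number of rounds until the ligand makes up `threshold` of the pool."""
    fraction = initial_fraction
    for round_number in range(1, max_rounds + 1):
        fraction = enrich_one_round(fraction, bound_high, bound_bulk)
        if fraction >= threshold:
            return round_number
    raise RuntimeError("threshold not reached within max_rounds")


if __name__ == "__main__":
    # Hypothetical starting abundance: one ligand per 10^10 molecules.
    print(rounds_to_dominate(1e-10))
```

Because the odds of drawing the ligand grow by the ratio bound_high / bound_bulk each round, the number of rounds required grows only logarithmically with the rarity of the ligand in the starting mixture.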

[0121] The experimental conditions used to select nucleic acid ligands to various targets in the preferred embodiment are to be selected to mimic the environment in which the target would be found in vivo. Example 10 below indicates how changing the selection conditions will affect the ligand solution obtained for a particular target. Although the ligand solution to NGF had significant similarities under high and low salt conditions, differences were observed. Adjustable conditions that may be altered to more accurately reflect the in vivo environment of the target include, but are not limited to, the total ionic strength, the concentration of bivalent cations and the pH of the solution. One skilled in the art would be able to easily select the appropriate separation conditions based on a knowledge of the given target.

[0122] In order to proceed to the amplification step, selected nucleic acids must be released from the target after partitioning. This process must be done without chemical degradation of the selected nucleic acids and must result in amplifiable nucleic acids. In a specific embodiment, selected RNA molecules were eluted from nitrocellulose filters using a freshly made solution containing 200 μl of a 7 M urea, 20 mM sodium citrate (pH 5.0), and 1 mM EDTA solution combined with 500 μl of phenol (equilibrated with 0.1 M sodium acetate pH 5.2). A solution of 200 μl 7 M urea with 500 μl of phenol has been successfully employed. The eluted solution of selected RNA was then extracted with ether, ethanol precipitated and the precipitate was resuspended in water. A number of different buffer conditions for elution of selected RNA from the filters can be used. For example, without limitation, nondetergent aqueous protein denaturing agents such as guanidinium chloride, guanidinium thiocyanate, etc., as are known in the art, can be used. The specific solution used for elution of nucleic acids from the filter can be routinely selected by one of ordinary skill in the art.

[0123] Alternative partitioning protocols for separating nucleic acids bound to targets, particularly proteins, are available to the art. For example, binding and partitioning can be achieved by passage of the test nucleic acid mixture through a column which contains the target molecule bound to a solid support material. Those nucleic acids that bind to the target will be retained on the column and unbound nucleic acids can be washed from the column.

[0124] Throughout this application, the SELEX process has been defined as an iterative process wherein selection and amplification are repeated until a desired selectivity has been attained. In one embodiment of the invention, the selection process may be efficient enough to provide a ligand solution after only one separation step. For example, in theory a column supporting the target through which the candidate mixture is introduced (under the proper conditions and with a long enough column) should be capable of separating nucleic acids based on affinity to the target sufficiently to obtain a ligand solution. To the extent that the original selection step is sufficiently selective to yield a ligand solution after only one step, such a process would also be included within the scope of this invention.

[0125] In one embodiment of this invention, SELEX is iteratively performed until a single or a discrete small number of nucleic acid ligands remain in the candidate mixture following amplification. In such cases, the ligand solution will be represented as a single nucleic acid sequence, and will not include a family of sequences having comparable binding affinities to the target.

[0126] In an alternate embodiment of the invention, SELEX iterations are terminated at some point when the candidate mixture has been enriched in higher binding affinity nucleic acid ligands, but still contains a relatively large number of distinct sequences. This point can be determined by one of skill in the art by periodically analyzing the sequence randomness of the bulk candidate mixture, or by assaying bulk affinity to the target.
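One possible way to put a number on "sequence randomness" is the Shannon entropy of each position across a set of aligned sequences. The sketch below is an assumption about how such an analysis might be done computationally, not a procedure described in this application; the function names and the example sequences are hypothetical, and the sequences are assumed to be of equal length and already aligned.

```python
import math
from collections import Counter


def column_entropy(column) -> float:
    """Shannon entropy, in bits, of the nucleotides seen at one position.

    Ranges from 0 bits (position fully fixed) to 2 bits (all four
    nucleotides equally frequent).
    """
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_positional_entropy(sequences) -> float:
    """Average per-position entropy over equal-length, aligned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must all have the same length")
    per_position = [column_entropy([s[i] for s in sequences]) for i in range(length)]
    return sum(per_position) / length


if __name__ == "__main__":
    # Hypothetical pools: the second is more converged than the first.
    early_pool = ["ACGU", "CGUA", "GUAC", "UACG"]
    late_pool = ["AAGU", "AAGU", "AAGC", "AAGU"]
    print(mean_positional_entropy(early_pool), mean_positional_entropy(late_pool))
```

A mean entropy that falls from round to round would indicate that the mixture is converging, and the round at which to stop could then be chosen before the value approaches zero.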

[0127] At this time, SELEX is terminated, and clones are prepared and sequenced. Of course, there will be an almost unlimited number of clones that could be sequenced. As seen in the Examples below, however, after sequencing between 20 and 50 clones it is generally possible to detect the most predominant sequences and defining characteristics of the ligand solution. In a hypothetical example, after cloning 30 sequences it will be found that 6 sequences are identical, while certain sequence portions of 20 of the other sequences are closely related to sequences within the “winning” sequence. Although the most predominant sequence may be considered a ligand solution to that target, it is often more appropriate to construct or describe a ligand solution that consists of a family of sequences that includes the common characteristics of many of the cloned sequences.

[0128] In a further embodiment of this invention, a ligand solution that is represented as a family of sequences having a number of defining characteristics (e.g., where the ligand solution is AAGUNNGUNNCNNNN (SEQ ID NO:2), where N can apparently be any of the four nucleotides) may be used to initiate an additional SELEX process. In this embodiment, the candidate mixture would be comprised of partially fixed and partially random nucleotides, the fixed nucleotides being selected based on the ligand solution obtained in the initial SELEX process. In this manner, if there is a single nucleotide sequence that binds better than the other members of the ligand solution family, it will be quickly identified.
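A family written with N positions can be treated computationally as a pattern. The sketch below is illustrative only; the function names are hypothetical, and the only inputs taken from this description are the family AAGUNNGUNNCNNNN (SEQ ID NO:2) and, in the usage lines, the sequence AAGUCCGUAACACAC (SEQ ID NO:3).

```python
import re

ANY_NUCLEOTIDE = "[ACGU]"


def family_regex(pattern: str):
    """Compile a family such as AAGUNNGUNNCNNNN, where N is any of A, C, G, U."""
    return re.compile("".join(ANY_NUCLEOTIDE if ch == "N" else re.escape(ch) for ch in pattern))


def family_size(pattern: str) -> int:
    """Number of distinct sequences the family contains."""
    return 4 ** pattern.count("N")


def is_member(pattern: str, sequence: str) -> bool:
    """True if `sequence` matches the family over its full length."""
    return family_regex(pattern).fullmatch(sequence) is not None


if __name__ == "__main__":
    family = "AAGUNNGUNNCNNNN"
    print(family_size(family))
    print(is_member(family, "AAGUCCGUAACACAC"))
```

With eight N positions this family contains 65,536 members, so a follow-up candidate mixture built on it can be sampled exhaustively.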

[0129] In an alternate further embodiment of the invention, a second SELEX experiment based on the ligand solution obtained in a SELEX process is also utilized. In this embodiment, the single most predominant sequence (e.g., AAGUCCGUAACACAC) (SEQ ID NO:3) is used to inform the second SELEX process. In this second SELEX process the candidate mixture is prepared in order to yield sequences based on the selected winner, while assuring that there will be sufficient randomization at each of the sequences. This candidate mixture may be produced by using nucleotide starting materials that are biased rather than randomized. For example, the A solution contains 75% A and 25% U, C and G. Although the nucleic acid synthesizer is set to yield the predominant nucleotide, the presence of the other nucleotides in the A solution will yield nucleic acid sequences that are predominant in A but that will also yield variations in this position. Again, this second SELEX round, informed by the results obtained in the initial SELEX process, will maximize the probabilities of obtaining the best ligand solution to a given target. Again, it must be clarified that the ligand solution may consist of a single preferred nucleic acid ligand, or it may consist of a family of structurally related sequences with essentially similar binding affinities.
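The consequences of biased synthesis can be estimated with a short sketch. This Python is illustrative and makes one loud assumption: the 25% of non-predominant nucleotide in each biased solution is read as 25% in total, split evenly among the other three nucleotides, which the text above does not state explicitly. The function names are hypothetical.

```python
import random

NUCLEOTIDES = "ACGU"


def doped_position_weights(base: str, bias: float = 0.75):
    """Sampling weights for one position of the biased ("doped") synthesis.

    Assumption: the remaining (1 - bias) is shared equally by the other
    three nucleotides.
    """
    other = (1.0 - bias) / 3
    return [bias if n == base else other for n in NUCLEOTIDES]


def synthesize_doped(winner: str, bias: float = 0.75, rng=None) -> str:
    """Draw one variant of `winner`, position by position."""
    rng = rng or random.Random()
    return "".join(
        rng.choices(NUCLEOTIDES, weights=doped_position_weights(base, bias))[0]
        for base in winner
    )


def probability_of_exact_winner(winner: str, bias: float = 0.75) -> float:
    """Chance that a synthesized molecule is identical to the winner."""
    return bias ** len(winner)


def expected_substitutions(winner: str, bias: float = 0.75) -> float:
    """Mean number of positions that differ from the winner."""
    return len(winner) * (1.0 - bias)


if __name__ == "__main__":
    winner = "AAGUCCGUAACACAC"
    rng = random.Random(0)  # seeded only to make the demonstration repeatable
    for _ in range(5):
        print(synthesize_doped(winner, rng=rng))
    print(probability_of_exact_winner(winner), expected_substitutions(winner))
```

Under this reading, a 15-nucleotide winner synthesized at 75% bias carries on average 3.75 substitutions per molecule, so the mixture explores the neighborhood of the winner while still containing the winner itself at a low but nonzero frequency.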

[0130] In practice, it may occasionally be preferred that the SELEX process not be performed until a single sequence is obtained. The SELEX process contains several bias points that may affect the predominance of certain sequences in a candidate mixture after several rounds of SELEX that are not related to the binding affinity of that sequence to the target. For example, a bias for or against certain sequences may occur during the production of cDNA from the RNA recovered after selection, or during the amplification process. The effects of such unpredictable biases can be minimized by halting SELEX prior to the time that only one or a small number of sequences predominate in the reaction mixture.

[0131] As stated above, sequence variation in the test nucleic acid mixture can be achieved or increased by mutation. For example, a procedure has been described for efficiently mutagenizing nucleic acid sequences during PCR amplification (Leung et al. 1989). This method or functionally equivalent methods can optionally be combined with amplification procedures in the present invention.

[0132] Alternatively, conventional methods of DNA mutagenesis can be incorporated into the nucleic acid amplification procedure. Applicable mutagenesis procedures include, among others, chemically induced mutagenesis and oligonucleotide site-directed mutagenesis.

[0133] The present invention can also be extended to utilize additional interesting capacities of nucleic acids and the manner in which they are known or will later be found to interact with targets such as proteins. For example, a SELEX methodology may be employed to screen for ligands that form Michael adducts with proteins. Pyrimidines, when they sit in the correct place within a protein, usually adjacent to a critical cysteine or other nucleophile, can react with that nucleophile to form a Michael adduct. The mechanism by which Michael adducts are formed involves a nucleophilic attack at the 6 position of the pyrimidine base to create a transient (but slowly reversing) intermediate that is really a 5,6-dihydropyrimidine. It is possible to test for the presence of such intermediates by observing whether binding between an RNA and a protein target occurs even after the protein is denatured with any appropriate denaturant. That is, one searches for a continued covalent interaction when the binding pocket of the target has been destroyed. However, Michael adducts are often reversible, and sometimes so quickly that the failure to identify a Michael adduct through this test does not indicate that one was not present at a prior moment.

[0134] SELEX may be done so as to take advantage of Michael adduct formation in order to create very high affinity, near-suicide substrates for an enzyme or other protein target. Imagine that after binding between a randomized mixture of RNAs and the target, prior to partitioning on a filter or by other means, the target is denatured. Subsequent partitioning, followed by reversal of the Michael adduct and cDNA synthesis on the released RNA, followed by the rest of the SELEX cycle, will enrich for RNAs that bind to a target prior to denaturation but continue to bind covalently until the Michael adduct is reversed by the scientist. This ligand, in vivo, would have the property of permanently inhibiting the target protein. The protein tRNA-uracil methyl transferase (RUMT) binds substrate tRNAs through a Michael adduct. When RUMT is expressed at high levels in E. coli the enzyme is found largely covalently bound to RNA, suggesting strongly that nearly irreversible inhibitors can be found through SELEX.

[0135] The method of the present invention has multiple applications. The method can be employed, for example, to assist in the identification and characterization of any protein binding site for DNA or RNA. Such binding sites function in transcriptional or translational regulation of gene expression, for example as binding sites for transcriptional activators or repressors, transcription complexes at promoter sites, replication accessory proteins and DNA polymerases at or near origins of replication, and ribosomes and translational repressors at ribosome binding sites. Sequence information of such binding sites can be used to isolate and identify regulatory regions, bypassing more labor-intensive methods of characterization of such regions. Isolated DNA regulatory regions can be employed, for example, in heterologous constructs to selectively alter gene expression.

[0136] It is an important and unexpected aspect of the present invention that the methods described herein can be employed to identify, isolate or produce nucleic acid molecules which will bind specifically to any desired target molecule. Thus, the present methods can be employed to produce nucleic acids specific for binding to a particular target. Such a nucleic acid ligand in a number of ways functionally resembles an antibody. Nucleic acid ligands which have binding functions similar to those of antibodies can be isolated by the methods of the present invention. Such nucleic acid ligands are designated herein nucleic acid antibodies and are generally useful in applications in which polyclonal or monoclonal antibodies have found application. Nucleic acid antibodies can in general be substituted for antibodies in any in vitro or in vivo application. It is only necessary that under the conditions in which the nucleic acid antibody is employed, the nucleic acid is substantially resistant to degradation. Applications of nucleic acid antibodies include the specific, qualitative or quantitative detection of target molecules from any source; purification of target molecules based on their specific binding to the nucleic acid; and various therapeutic methods which rely on the specific direction of a toxin or other therapeutic agent to a specific target site.

[0137] Target molecules are preferably proteins, but can also include among others carbohydrates, peptidoglycans and a variety of small molecules. As with conventional proteinaceous antibodies, nucleic acid antibodies can be employed to target biological structures, such as cell surfaces or viruses, through specific interaction with a molecule that is an integral part of that biological structure. Nucleic acid antibodies are advantageous in that they are not limited by self tolerance, as are conventional antibodies. Also nucleic acid antibodies do not require animals or cell cultures for synthesis or production, since SELEX is a wholly in vitro process. As is well-known, nucleic acids can bind to complementary nucleic acid sequences. This property of nucleic acids has been extensively utilized for the detection, quantitation and isolation of nucleic acid molecules. Thus, the methods of the present invention are not intended to encompass these well-known binding capabilities between nucleic acids. Specifically, the methods of the present invention related to the use of nucleic acid antibodies are not intended to encompass known binding affinities between nucleic acid molecules. A number of proteins are known to function via binding to nucleic acid sequences, such as regulatory proteins which bind to nucleic acid operator sequences. The known ability of certain nucleic acid binding proteins to bind to their natural sites, for example, has been employed in the detection, quantitation, isolation and purification of such proteins. The methods of the present invention related to the use of nucleic acid antibodies are not intended to encompass the known binding affinity between nucleic acid binding proteins and nucleic acid sequences to which they are known to bind. However, novel, non-naturally-occurring sequences which bind to the same nucleic acid binding proteins can be developed using SELEX. 
It should be noted that SELEX allows very rapid determination of nucleic acid sequences that will bind to a protein and, thus, can be readily employed to determine the structure of unknown operator and binding site sequences, which sequences can then be employed for applications as described herein. It is believed that the present invention is the first disclosure of the general use of nucleic acid molecules for the detection, quantitation, isolation and purification of proteins which are not known to bind nucleic acids. As will be discussed below, certain nucleic acid antibodies isolatable by SELEX can also be employed to affect the function, for example inhibit, enhance or activate the function, of specific target molecules or structures. Specifically, nucleic acid antibodies can be employed to inhibit, enhance or activate the function of proteins.

[0138] Proteins that have a known capacity to bind nucleic acids (such as DNA polymerases, other replicases, and proteins that recognize sites on RNA but do not engage in further catalytic action) yield, via SELEX, high affinity RNA ligands that bind to the active site of the target protein. Thus, in the case of HIV-1 reverse transcriptase the resultant RNA ligand (called 1.1 in Example 2) blocks cDNA synthesis in the presence of a primer DNA, an RNA template, and the four deoxynucleotide triphosphates.

[0139] The inventors' theory of RNA structures suggests that nearly every protein will serve as a target for SELEX. The initial experiments against non-nucleic acid binding proteins were performed with three proteins not thought to interact with nucleic acids in general or RNA in particular. The three proteins were tissue plasminogen activator (tPA), nerve growth factor (NGF), and the extracellular domain of the growth factor receptor (gfR-Xtra). All of these proteins were tested to see if they would retain mixed randomized RNAs on a nitrocellulose filter. tPA and NGF showed affinity for randomized RNA, with Kd values just below micromolar. gfR-Xtra did not bind with measurable affinity, suggesting that if an RNA antibody exists for that protein it must bind to a site that has no affinity for most other RNAs.

[0140] tPA and NGF were taken through the SELEX drill using RNAs with 30 randomized positions. Both tPA and NGF gave ligand solutions in the SELEX drill, suggesting that some site on each protein bound the winning sequences more tightly than that site (or another site) bound other RNAs. The winning sequences are different for the two proteins.

[0141] Since tPA and NGF worked so well in the SELEX drill, a random collection of proteins and peptides was tested to see if they had any affinity for RNA. It was reasoned that if a protein has any affinity for RNA, the SELEX drill will, on the average, yield higher affinity sequences which contact the same region of the target that provides the low, generalized affinity. A set of proteins and peptides was tested to see if randomized RNAs (containing 40 randomized positions) would be retained on nitrocellulose filters. About two thirds of the proteins tested bound RNA, and a few proteins bound RNA very tightly. See Example 9.

[0142] Proteins that do not bind RNA to nitrocellulose filters may fail for trivial reasons having nothing to do with the likelihood of raising RNA antibodies. One example, bradykinin, fails to bind to nitrocellulose filters, and thus would fail in the above experiment. A bradykinin linked to a solid matrix through the amino terminus of the peptide was prepared, and randomized RNA was then found to bind tightly to the matrix (see Example 7). Thus in the initial experiments two short peptides, bradykinin and bombesin, bind randomized RNAs quite tightly. Any high affinity RNA ligand obtained through SELEX with these peptide targets would, perhaps, be an antagonist of these active peptides, and might be useful therapeutically. It is difficult to imagine an RNA of about 30 nucleotides binding to a very small peptide without rendering that peptide inactive for virtually any activity.

[0143] As described in Examples 4, 7, 9 and 10 below, proteins not thought to interact with nucleic acids in nature were found to bind a random mixture of nucleic acids to a non-trivial extent. It has further been shown that, for proteins found to bind RNA mixtures non-specifically, a ligand solution can be obtained following SELEX. It is, therefore, a potentially valuable screen, prior to the performance of SELEX, to determine if a given target shows any binding to a random mixture of nucleic acids.

[0144] It is a second important and unexpected aspect of the present invention that the methods described herein can be employed to identify, isolate or produce nucleic acid molecules which will bind specifically to a particular target molecule and affect the function of that molecule. In this aspect, the target molecules are again preferably proteins, but can also include, among others, carbohydrates and various small molecules to which specific nucleic acid binding can be achieved. Nucleic acid ligands that bind to small molecules can affect their function by sequestering them or by preventing them from interacting with their natural ligands. For example, the activity of an enzyme can be affected by a nucleic acid ligand that binds the enzyme's substrate. Nucleic acid ligands, i.e., nucleic acid antibodies, of small molecules are particularly useful as reagents for diagnostic tests (or other quantitative assays). For example, the presence of controlled substances, bound metabolites or abnormal quantities of normal metabolites can be detected and measured using nucleic acid ligands of the invention. A nucleic acid ligand having catalytic activity can affect the function of a small molecule by catalyzing a chemical change in the target. The range of possible catalytic activities is at least as broad as that displayed by proteins. The strategy of selecting a ligand for a transition state analog of a desired reaction is one method by which catalytic nucleic acid ligands can be selected.

[0145] It is believed that the present invention for the first time discloses the general use of nucleic acid molecules to effect, inhibit or enhance protein function. The binding selection methods of the present invention can be readily combined with secondary selection or screening methods for modifying target molecule function on binding to selected nucleic acids. The large population of variant nucleic acid sequences that can be tested by SELEX enhances the probability that nucleic acid sequences can be found that have a desired binding capability and function to modify target molecule activity. The methods of the present invention are useful for selecting nucleic acid ligands which can selectively affect function of any target protein including proteins which bind nucleic acids as part of their natural biological activity and those which are not known to bind nucleic acid as part of their biological function. The methods described herein can be employed to isolate or produce nucleic acid ligands which bind to and modify the function of any protein which binds a nucleic acid, either DNA or RNA, either single-stranded or double-stranded; a nucleoside or nucleotide including those having purine or pyrimidine bases or bases derived therefrom, specifically including those having adenine, thymine, guanine, uracil, cytosine and hypoxanthine bases and derivatives, particularly methylated derivatives, thereof; and coenzyme nucleotides including among others nicotinamide nucleotides, flavin-adenine dinucleotides and coenzyme A. 
It is contemplated that the method of the present invention can be employed to identify, isolate or produce nucleic acid molecules which will affect catalytic activity of target enzymes, i.e., inhibit catalysis or modify substrate binding; affect the functionality of protein receptors, i.e., inhibit binding to receptors or modify the specificity of binding to receptors; affect the formation of protein multimers, i.e., disrupt quaternary structure of protein subunits; and modify transport properties of protein, i.e., disrupt transport of small molecules or ions by proteins.

[0146] The SELEX process is defined herein as the iterative selection and amplification of a candidate mixture of nucleic acid sequences repeated until a ligand solution has been obtained. A further step in the process is the production of nucleic acid antibodies to a given target. Even when the ligand solution derived for a given process is a single sequence, the nucleic acid antibody containing just the ligand solution must be synthesized. For example, a SELEX experiment may give a preferred single ligand solution that consists of only 20 of the 30 randomized nucleotide positions used in the SELEX candidate mixture. The therapeutically valuable nucleic acid antibody would not, preferably, contain the 10 non-critical nucleotides or the fixed sequences required for the amplification step of SELEX. Once the desired structure of the nucleic acid antibody is determined based on the ligand solution, the actual synthesis of the nucleic acid antibody will be performed according to a variety of techniques well known in the art.

[0147] The nucleic acid antibody may also be constructed based on a ligand solution for a given target that consists of a family of sequences. In such case, routine experimentation will show that a given sequence is preferred due to circumstances unrelated to the relative affinity of the ligand solution to the target. Such considerations would be obvious to one of ordinary skill in the art.

[0148] In an alternate embodiment of the present invention, the nucleic acid antibody may contain a plurality of nucleic acid ligands to the same target. For example, SELEX may identify two discrete ligand solutions. As the two ligand solutions may bind the target at different locations, the nucleic acid antibody may preferably contain both ligand solutions. In another embodiment, the nucleic acid antibody may contain more than one of a single ligand solution. Such multivalent nucleic acid antibodies will have increased binding affinity to the target unavailable to an equivalent nucleic acid antibody having only one ligand.

[0149] In addition, the nucleic acid antibody may also contain other elements that will 1) add independent affinity for the target to the nucleic acid antibody; 2) dependently enhance the affinity of the nucleic acid ligand to the target; 3) direct or localize the nucleic acid antibody to the proper location in vivo where treatment is desired; or 4) utilize the specificity of the nucleic acid ligand to the target to effect some additional reaction at that location.

[0150] The methods of the present invention are useful for obtaining nucleic acids which will inhibit function of a target protein, and are particularly useful for obtaining nucleic acids which inhibit the function of proteins whose function involves binding to nucleic acid, nucleotides, nucleosides and derivatives and analogs thereof. The methods of the present invention can provide nucleic acid inhibitors, for example, of polymerases, reverse transcriptases, and other enzymes in which a nucleic acid, nucleotide or nucleoside is a substrate or co-factor.

[0151] Secondary selection methods that can be combined with SELEX include among others selections or screens for enzyme inhibition, alteration of substrate binding, loss of functionality, disruption of structure, etc. Those of ordinary skill in the art are able to select among various alternatives those selection or screening methods that are compatible with the methods described herein.

[0152] It will be readily apparent to those of skill in the art that in some cases, i.e., for certain target molecules or for certain applications, it may be preferred to employ RNA molecules in preference to DNA molecules as ligands, while in other cases DNA ligands may be preferred to RNA.

[0153] The selection methods of the present invention can also be employed to select nucleic acids which bind specifically to a molecular complex, for example to a substrate/protein or inhibitor/protein complex. Among those nucleic acids that bind specifically to the complex molecules, but not the uncomplexed molecules, there are nucleic acids which will inhibit the formation of the complex. For example, among those nucleic acid ligands which are selected for specific binding to a substrate/enzyme complex there are nucleic acids which can be readily selected which will inhibit substrate binding to the enzyme and thus inhibit or disrupt catalysis by the enzyme.

[0154] An embodiment of the present invention, which is particularly useful for the identification or isolation of nucleic acids which bind to a particular functional or active site in a protein, or other target molecule, employs a molecule known, or selected, for binding to a desired site within the target protein to direct the selection/amplification process to a subset of nucleic acid ligands that bind at or near the desired site within the target molecule. In a simple example, a nucleic acid sequence known to bind to a desired site in a target molecule is incorporated near the randomized region of all nucleic acids being tested for binding. SELEX is then used (FIG. 9) to select those variants, all of which will contain the known binding sequence, which bind most strongly to the target molecule. A longer binding sequence, which is anticipated to either bind more strongly to the target molecule or more specifically to the target, can thus be selected. The longer binding sequence can then be introduced near the randomized region of the nucleic acid test mixture and the selection/amplification steps repeated to select an even longer binding sequence. Iteration of these steps (i.e., incorporation of selected sequence into test mixtures followed by selection/amplification for improved or more specific binding) can be repeated until a desired level of binding strength or specificity is achieved. This iterative “walking” procedure allows the selection of nucleic acids highly specific for a particular target molecule or site within a target molecule. Another embodiment of such an iterative “walking” procedure employs an “anchor” molecule which is not necessarily a nucleic acid (see FIGS. 10 and 11). In this embodiment a molecule which binds to a desired target, for example a substrate or inhibitor of a target enzyme, is chemically modified such that it can be covalently linked to an oligonucleotide of known sequence (the “guide oligonucleotide” of FIG. 10). 
The guide oligonucleotide chemically linked to the “anchor” molecule that binds to the target also binds to the target molecule. The sequence complement of the guide oligonucleotide is incorporated near the randomized region of the test nucleic acid mixture. SELEX is then performed to select for those sequences that bind most strongly to the target molecule/anchor complex. The iterative walking procedure can then be employed to select or produce longer and longer nucleic acid molecules with enhanced strength of binding or specificity of binding to the target. The use of the “anchor” procedure is expected to allow more rapid isolation of nucleic acid ligands that bind at or near a desired site within a target molecule. In particular, it is expected that the “anchor” method in combination with iterative “walking” procedures will result in nucleic acids which are highly specific inhibitors of protein function (FIG. 11).

[0155] In certain embodiments of the performance of SELEX it is desirable to perform plus/minus screening in conjunction with the selection process to assure that the selection process is not being skewed by some factor unrelated to the affinity of the nucleic acid sequences to the target. For example, when selection is performed by protein binding nitrocellulose, it has been seen that certain nucleic acid sequences are preferentially retained by nitrocellulose and can be selected during the SELEX process. These sequences can be removed from the candidate mixture by incorporating additional steps wherein the preceding SELEX mixture is passed through nitrocellulose to selectively remove those sequences selected solely for that property. Such screening and selection may be performed whenever the target contains impurities or the selection process introduces biases unrelated to affinity to the target.

[0156] SELEX has been demonstrated by application to the isolation of RNA molecules which bind to and inhibit the function of bacteriophage T4 DNA polymerase, also termed gp43. The novel RNA ligand of T4 DNA polymerase is useful as a specific assay reagent for T4 DNA polymerase. The synthesis of T4 DNA polymerase is autogenously regulated. In the absence of functional protein, amber fragments and mutant proteins are overexpressed when compared to the rate of synthesis of wild-type protein in replication-deficient infections (Russel (1973) J. Mol. Biol. 79:83-94). In vitro translation of an N-terminal fragment of gp43 is specifically repressed by the addition of purified gp43, and gp43 protects a discrete portion of the mRNA near its ribosome binding site from nuclease attack (Andrake et al. (1988) Proc. Natl. Acad. Sci. USA 85:7942-7946). The size and sequence of the RNA translational operator to which gp43 binds and the strength of that binding have been established. The minimal size of the gp43 operator is a sequence of about 36 nucleotides, as illustrated in FIG. 1, which is predicted to have a hairpin loop structure as indicated therein. The minimal size of the operator was determined by analysis of binding of end-labeled hydrolysis fragments of the operator to gp43. Analysis of binding of operator mutants in the hairpin and loop sequence indicates that gp43 binding to the operator is sensitive to primary base changes in the helix. Binding to the polymerase was even more reduced by changes which significantly reduce hairpin stability. Operator binding was found to be very sensitive to loop sequence. It was found that replication and operator binding in gp43 are mutually exclusive activities. The addition of micromolar amounts of purified RNAs containing intact operator was found to strongly inhibit in vitro replication by gp43.

[0157] The wild-type gp43 operator, FIG. 1, was employed as the basis for the design of an initial mixture of RNA molecules containing a randomized sequence region to assess the ability of the selection/amplification process to isolate nucleic acid molecules that bind to a protein. The RNA test mixture was prepared by in vitro transcription from a 110 base single-stranded DNA template. The template was constructed as illustrated in FIG. 1 to encode most of the wild-type operator sequence, except for the loop sequence. The eight base loop sequence was replaced by a randomized sequence region which was synthesized to be fully random at each base. The template also contained sequences necessary for efficient amplification: a sequence at its 3′ end complementary to a primer for reverse transcription and amplification in polymerase chain reactions, and a sequence in its 5′ end required for T7 RNA polymerase transcriptional initiation and sufficient sequence complementary to the cDNA of the in vitro transcript. The DNA template is thus a mixture of all loop sequence variants, theoretically containing 65,536 individual species.
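The figure of 65,536 species follows from four possible bases at each of eight randomized loop positions. The following is a minimal sketch of that count; the function names are illustrative only and are not part of the disclosed method.

```python
from itertools import product

BASES = "ACGU"   # RNA alphabet of the transcribed loop
LOOP_LENGTH = 8  # eight randomized loop positions, as stated in the text


def count_variants(length, alphabet=BASES):
    """Number of distinct sequences of the given length."""
    return len(alphabet) ** length


def enumerate_variants(length, alphabet=BASES):
    """Yield every possible sequence of the given length, one at a time."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


total = count_variants(LOOP_LENGTH)
# The wild-type loop AAUAACUC is one member of the enumerated pool.
wild_type_in_pool = "AAUAACUC" in set(enumerate_variants(LOOP_LENGTH))
print(total, wild_type_in_pool)
```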

[0158] The dissociation constant for the wild-type loop RNA was found to be about 5×10⁻⁹ M. The dissociation constant for the population of loop sequence variants was measured to be about 2.5×10⁻⁷ M. Randomization of the loop sequence lowered binding affinity 50-fold.
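The 50-fold figure is the ratio of the two dissociation constants. A short arithmetic check, using only the two values quoted above (the variable names are illustrative):

```python
import math

KD_WILD_TYPE = 5e-9      # M, wild-type loop RNA (value from the text)
KD_RANDOM_POOL = 2.5e-7  # M, population of loop variants (value from the text)


def fold_change(kd_reference, kd_other):
    """How many times weaker kd_other binding is than kd_reference binding."""
    return kd_other / kd_reference


fold = fold_change(KD_WILD_TYPE, KD_RANDOM_POOL)
print(round(fold))
```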

[0159] In vitro transcripts containing the loop sequence variants were mixed with purified gp43 and incubated. The mixture was filtered through a nitrocellulose filter. Protein-RNA complexes are retained on the filter and unbound RNA is not. Selected RNA was then eluted from the filters as described in Example 1. Selected RNAs were extended with AMV reverse transcriptase in the presence of 3′ primer as described in Gauss et al. (1987) supra. The resulting cDNA was amplified with Taq DNA polymerase in the presence of the 5′ primer for 30 cycles as described in Innis et al. (1986) supra. The selected amplified DNA served as a template for in vitro transcription to produce selected amplified RNA transcripts which were then subjected to another round of binding selection/amplification. The RNA/protein ratio in the binding selection mixture was held constant throughout the cycles of selection. The iterative selection/amplification was performed using several different RNA/protein molar ratios. In all experiments RNA was in excess: experiment A employed an RNA/gp43 of 10/1 (moles/moles); experiment B employed an RNA/gp43 of 1000/1; and experiment C employed an RNA/gp43 of 100/1.
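The enrichment produced by repeated selection and amplification can be illustrated with a toy numerical model. This sketch is not the experimental procedure: it assumes each species is retained on the filter in proportion to a simple binding isotherm, ignores competition between RNAs for protein (although RNA was in excess in the experiments described), ignores filter background, and treats amplification as renormalization. The protein concentration and the starting abundance of the tight binder are hypothetical values chosen for illustration.

```python
def selection_round(pool, protein_conc):
    """One toy selection/amplification cycle.

    pool maps a species name to (Kd in M, fraction of the pool).
    Retention follows protein_conc / (Kd + protein_conc); the retained
    material is then renormalized to stand in for amplification.
    """
    retained = {
        name: frac * protein_conc / (kd + protein_conc)
        for name, (kd, frac) in pool.items()
    }
    total = sum(retained.values())
    return {name: (pool[name][0], amount / total) for name, amount in retained.items()}


def run_selex(pool, protein_conc, rounds):
    """Return the pool composition before selection and after each round."""
    history = [pool]
    for _ in range(rounds):
        pool = selection_round(pool, protein_conc)
        history.append(pool)
    return history


# Hypothetical starting pool: a rare tight binder among bulk sequences.
start = {
    "tight": (5e-9, 2 / 65536),        # Kd from the text; abundance assumed
    "bulk": (2.5e-7, 1 - 2 / 65536),   # bulk Kd from the text
}
history = run_selex(start, protein_conc=1e-8, rounds=4)  # 1e-8 M is assumed
tight_fractions = [p["tight"][1] for p in history]
print(tight_fractions)
```

In this model the tight binder's share of the pool grows each round by the ratio of the two retention probabilities, which is why a few cycles can bring a rare sequence to prominence.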

[0160] The progress of the selection process was monitored by filter binding assays of labelled transcripts of amplified cDNA at the completion of each cycle of the procedure. Batch sequencing of the RNA products from each round for experiment B was also done to monitor the progress of the selection. Autoradiograms of sequencing gels of RNA products after 2, 3 and 4 rounds of selection/amplification are shown in FIG. 3. It is clear that there was no apparent loop sequence bias introduced until after the third selection. After the fourth round of selection, an apparent consensus sequence for the eight base loop sequence is discernable as: A(a/g)(u/c)AAC(u/c)(u/c). Batch sequencing of selected RNA after the fourth round of selection for experiments A, B and C is compared in FIG. 4. All three independent SELEX procedures using different RNA/protein ratios gave similar apparent consensus sequences. There was, however, some apparent bias for wild-type loop sequence (AAUAACUC) in the selected RNA from experiments A and C.
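The consensus notation A(a/g)(u/c)AAC(u/c)(u/c) can be read as a pattern with four two-way positions, giving 16 allowable loop sequences. The sketch below converts that notation to a regular expression and lists the matching sequences; the helper names are illustrative and the conversion assumes the "(x/y)" notation exactly as written above.

```python
import re
from itertools import product

CONSENSUS = "A(a/g)(u/c)AAC(u/c)(u/c)"


def consensus_to_regex(consensus):
    """Turn each "(x/y)" group into a character class such as "[AG]"."""
    pattern = re.sub(
        r"\(([a-z])/([a-z])\)",
        lambda m: "[" + (m.group(1) + m.group(2)).upper() + "]",
        consensus,
    )
    return re.compile(pattern)


def matching_loops(regex, length=8, alphabet="ACGU"):
    """All sequences of the given length that fit the consensus exactly."""
    candidates = ("".join(c) for c in product(alphabet, repeat=length))
    return [seq for seq in candidates if regex.fullmatch(seq)]


loop_regex = consensus_to_regex(CONSENSUS)
allowed = matching_loops(loop_regex)
print(loop_regex.pattern, len(allowed))
```

Both the wild-type loop AAUAACUC and the variant AGCAACCU fit this pattern.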

[0161] In order to determine what allowable sequence combinations were actually present in the selected RNAs, individual DNAs were cloned from selected RNAs after the fourth round of selection in experiment B. The batch sequence result from experiment B appeared to indicate an even distribution of the two allowable nucleotides which composed each of the four variable positions of the loop sequence. Individuals were cloned into pUC18 as described by Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, (Cold Spring Harbor, N.Y.), Sections 1.13; 1.85-1.86. Twenty individual clones that were identified by colony filter hybridization to the 3′ primer were sequenced. None of the sequenced clones were mutant at any place in the operator sequence outside of the loop sequence. Only five variant sequences were observed as shown in FIG. 7, and surprisingly only two sequence variants were the major components of the selected mixture. The frequencies of each sequence in the 20 individual isolates sequenced are also given in FIG. 7. The wild-type sequence AAUAACUC and the loop AGCAACCU were present in approximately equal amounts in the selected RNA of experiment B. The other selected variants were 1 base mutants of the two major variants. The strength of binding of the sequence variants was compared in filter binding assays using labelled in vitro transcripts derived from each of the purified clonal isolates. As shown in FIG. 6, a rough correlation between binding affinity of an RNA for gp43 and the abundance of the selected sequence was observed. The two major loop sequence variants showed approximately equal binding affinities for gp43.

[0162] The loop sequence variant RNAs isolated by the selection/amplification process, shown in FIG. 7, can all act as inhibitors of gp43 polymerase activity as has been demonstrated for the wild-type operator sequence.

[0163] An example of the use of SELEX has been provided by selection of a novel RNA ligand of bacteriophage T4 DNA polymerase (gp43) (Andrake et al. (1988) Proc. Natl. Acad. Sci. USA 85:7942-7946).

[0164] The present invention includes specific ligand solutions, derived via the SELEX process, that are shown to have an increased affinity to HIV-1 reverse transcriptase, R17 coat protein, HIV-1 rev protein, HSV DNA polymerase, E. coli ribosomal protein S1, tPA and NGF. These ligand solutions can be utilized by one of skill in the art to synthesize nucleic acid antibodies to the various targets.

[0165] The following examples describe the successful application of SELEX to a wide variety of targets. The targets may generally be divided into two categories: those that are nucleic acid binding proteins and those proteins not known to interact with nucleic acids. In each case a ligand solution is obtained. In some cases it is possible to represent the ligand solution as a nucleic acid motif such as a hairpin loop, an asymmetric bulge or a pseudoknot. In other examples the ligand solution is presented as a primary sequence. In such cases it is not meant to be implied that the ligand solution does not contain a definitive tertiary structure.

[0166] In addition to T4 DNA polymerase, targets on which SELEX has been successfully performed include bacteriophage R17 coat protein, HIV reverse transcriptase (HIV-RT), HIV-1 rev protein, HSV DNA polymerase plus or minus cofactor, E. coli ribosomal protein S1, tPA and NGF. The following experiments also describe a protocol for testing the bulk binding affinity of a randomized nucleic acid candidate mixture to a variety of proteins. Example 7 also describes the immobilization of bradykinin and the results of bulk randomized nucleic acid binding studies on bradykinin.

[0167] The examples and illustrations herein are not to be taken as limiting in any way. The fundamental insight underlying the present invention is that nucleic acids as chemical compounds can form a virtually limitless variety of sizes, shapes and configurations and are capable of an enormous repertoire of binding and catalytic functions, of which those known to exist in biological systems are merely a glimpse.

EXAMPLES

[0168] The following materials and methods were used throughout. The transcription vector pT7-2 is commercially available (U.S. Biochemical Company, Cleveland, Ohio). Plasmid pUC18 is described by Norrander et al. (1983) Gene 24:15-27 and is also commercially available from New England Biolabs. All manipulations of DNA to create new recombinant plasmids were as described in Maniatis et al. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., except as otherwise noted. DNA oligonucleotides were synthesized and purified as described in Gauss et al. (1987) Mol. Gen. Genet. 206:24-34.

[0169] In vitro transcriptions with T7 RNA polymerase and RNA gel-purification were performed as described in Milligan et al. (1987) Nucl. Acids Res. 15:8783-8798, except that in labeling reactions the concentrations of ATP, CTP, and GTP were 0.5 mM each, and the UTP concentration was 0.05 mM. The UTP was labeled at the alpha position with ³²P at a specific activity of approximately 20 Ci/mmol. Crude mRNA preparations from T4 infections, labeling of oligos, and primer extension with AMV reverse transcriptase were all according to Gauss et al. (1987) supra.

[0170] Dilutions of labeled, gel-purified RNA and purified gp43 were made in 200 mM potassium acetate, 50 mM Tris-HCl pH 7.7 at 4° C. In nitrocellulose filter binding assays, purified gp43 was serially diluted and 30 μl aliquots of each dilution of protein were added to 30 μl aliquots of diluted, labeled, gel-purified RNA. The RNA dilution (50 μl) was spotted on a fresh nitrocellulose filter, dried and counted to determine input counts per tube. The concentration of protein in the reactions ranged from 10⁻¹⁰ M to 10⁻⁸ M and the concentration of the RNAs in each experiment was approximately 10⁻¹² M. After incubation at 4° C. for 30 minutes, each tube was placed at 37° C. for 3 minutes and 50 μl of each sample filtered through pre-wet nitrocellulose filters (Millipore #HAWP 025 00) and washed with 3 ml of 200 mM potassium acetate, 50 mM Tris-HCl pH 7.7. The filters were dried and counted in Ecolume™ scintillation fluid (ICN Biomedicals, Inc.). Controls were done in the absence of gp43, from which the background (always less than about 5% of the input counts) was determined. From each set of measurements the background was subtracted, and the percent of total input counts remaining on the filters calculated. From each set of data points, a best-fit theoretical bimolecular binding curve was generated using a version of a published program (Caceci and Cacheris, 1984 supra) modified to construct a curve described by the equation,

σ=A[gp43]/(Kd+[gp43])

[0171] where σ is the fraction of the total RNA that is bound to the filter, A is the percent of RNA at which binding saturates (approximately 60% for this protein-RNA interaction), [gp43] is the input gp43 concentration, and Kd is the dissociation constant for the bimolecular reaction. This equation is an algebraic rearrangement of equation [1-5] from Bisswanger (1979) Theorie und Methoden der Enzymkinetik, Verlag Chemie, Weinheim, FRG, p. 9 with the simplifying assumption that the concentration of the protein far exceeds the concentration of RNA-protein complexes, an assumption which is valid in the experiments described.
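The data reduction and curve fitting described above can be sketched as follows. This is a minimal illustration, not the published program of Caceci and Cacheris: it substitutes a simple grid search over Kd with a closed-form least-squares plateau A, and the data points are synthetic values generated from the equation itself. All function names are hypothetical, and binding is expressed in percent throughout.

```python
def percent_bound(filter_counts, background_counts, input_counts):
    """Background-subtracted counts on the filter as a percent of input."""
    return 100.0 * (filter_counts - background_counts) / input_counts


def binding_curve(protein_conc, kd, plateau):
    """sigma = A[gp43] / (Kd + [gp43]), with A given in percent."""
    return plateau * protein_conc / (kd + protein_conc)


def fit_binding_curve(concs, observed, kd_grid):
    """Grid search over Kd; for each Kd the best plateau A is closed-form."""
    best = None
    for kd in kd_grid:
        x = [c / (kd + c) for c in concs]
        plateau = sum(xi * yi for xi, yi in zip(x, observed)) / sum(xi * xi for xi in x)
        sse = sum((plateau * xi - yi) ** 2 for xi, yi in zip(x, observed))
        if best is None or sse < best[0]:
            best = (sse, kd, plateau)
    return best[1], best[2]


# Synthetic, noise-free data over the protein range given in the text.
concs = [1e-10, 3e-10, 1e-9, 3e-9, 1e-8]
observed = [binding_curve(c, kd=5e-9, plateau=60.0) for c in concs]

# Log-spaced candidate Kd values from 1e-10 M to 1e-7 M.
kd_grid = [10 ** (-10 + 0.01 * i) for i in range(301)]
kd_fit, plateau_fit = fit_binding_curve(concs, observed, kd_grid)
print(kd_fit, plateau_fit)
```

With real filter counts, `percent_bound` would be applied to each tube first, and the resulting values passed to `fit_binding_curve` in place of the synthetic `observed` list.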

Example 1 Selection of RNA Inhibitors of T4 DNA Polymerase

[0172] A 110 base single-stranded DNA template for in vitro transcription was created as shown in FIG. 2 by ligation of three synthetic oligonucleotides (Table 1, oligonucleotides 3, 4 and 5) in the presence of two capping oligonucleotides (Table 1, oligonucleotides 1 and 2). One of the template-creating oligos was also used as the 3′ primer in reverse transcription of the in vitro transcript and subsequent amplification in polymerase chain reactions (PCRs) (Innis et al. (1988) Proc. Natl. Acad. Sci. USA 85:9436-9440). One of the capping oligos (1) contains the information required for T7 RNA polymerase transcriptional initiation and sufficient sequence complementarity to the cDNA of the in vitro transcript to serve as the 5′ primer in the PCR amplification steps. The DNA template encoded an RNA which contains the entire RNA recognition site for T4 DNA polymerase except that a completely random sequence was substituted in place of the sequence which would encode the wild-type loop sequence AAUAACUC. The random sequence was introduced by conventional chemical synthesis using a commercial DNA synthesizer (Applied Biosystems) except that all four dNTP's were present in equimolar amounts in the reaction mixture for each position indicated by N in the sequence of oligonucleotide number 4 (Table 1). The random sequence is flanked by primer annealing sequence information for the 5′ and 3′ oligos used in PCR. The DNA template is thus a mixture of all loop sequence variants, theoretically containing 65,536 individual species. The dissociation constant for the wild-type loop variant RNA sequence is about 5×10⁻⁹ M and for the population of sequences was measured to be about 2.5×10⁻⁷ M, a 50-fold lower binding affinity.

TABLE 1
1) 5′-TAATACGACTCACTATAGGGAGCCAACACCACAATTCCAATCAAG-3′ (SEQ ID NO:4)
2) 5′-GGGCTATAAACTAAGGAATATCTATGAAAG-3′ (SEQ ID NO:5)
3) 5′-GAATTGTGGTGTTGGCTCCCTATAGTGAGTCGTATTA-3′ (SEQ ID NO:6)
4) 5′-ATATTCCTTAGTTTATAGCCCNNNNNNNNAGGCTCTTGATTG-3′ (SEQ ID NO:7)
5) 5′-GTTTCAATAGAGATATAAAATTCTTTCATAG-3′ (SEQ ID NO:8)
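The two figures quoted above (65,536 species and a 50-fold affinity difference) follow from simple arithmetic, sketched here. The variable names are illustrative only; the loop length is taken from the eight-nucleotide wild-type loop AAUAACUC.

```python
WILD_TYPE_LOOP = "AAUAACUC"
BASES = "ACGU"

# Every loop position may hold any of four bases, so the pool has 4**8 variants.
n_species = len(BASES) ** len(WILD_TYPE_LOOP)

# Fraction of the unselected pool expected to carry the wild-type loop.
wild_type_fraction = 1 / n_species

# Ratio of the two dissociation constants quoted in the text.
KD_WILD_TYPE_M = 5e-9
KD_POPULATION_M = 2.5e-7
fold_weaker = KD_POPULATION_M / KD_WILD_TYPE_M
```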

[0173] In vitro transcripts containing the loop sequence variants were mixed with purified gp43 at three different RNA-protein ratios throughout the multiple rounds of selection. (For A and B the concentration of gp43 was 3×10⁻⁸ M, “low protein,” and for C the concentration of gp43 was 3×10⁻⁷ M, “high protein.” For A the concentration of RNA was about 3×10⁻⁷ M, “low RNA,” and for B and C the concentration of RNA was about 3×10⁻⁵ M, “high RNA.”)
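The three conditions can be tabulated to make the molar excess of RNA over protein explicit. This is only a restatement of the concentrations given above; the dictionary layout and key names are assumptions for illustration.

```python
# Concentrations in mol/L, as stated for experiments A, B and C.
conditions = {
    "A": {"gp43_M": 3e-8, "rna_M": 3e-7},   # low protein, low RNA
    "B": {"gp43_M": 3e-8, "rna_M": 3e-5},   # low protein, high RNA
    "C": {"gp43_M": 3e-7, "rna_M": 3e-5},   # high protein, high RNA
}

# Molar excess of RNA over protein in each experiment.
rna_excess = {name: c["rna_M"] / c["gp43_M"] for name, c in conditions.items()}
```

RNA is in excess in all three experiments, by roughly 10-fold (A), 1000-fold (B) and 100-fold (C).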

[0174] One round consisted of the following steps:

[0175] 1) Selection. The RNA and protein were mixed in the desired ratios described above, incubated at 37° C., washed through a nitrocellulose filter, and RNA was eluted from the filters as described supra.

[0176] 2) Amplification. The RNA eluted from filters was extended with AMV reverse transcriptase in the presence of 50 picomoles of 3′ primer in a 50 μl reaction under conditions described in Gauss et al. (1987) supra. To the resulting cDNA synthesis reaction 50 picomoles of 5′ primer was added, and the mixture was amplified in a reaction volume of 100 μl with Taq DNA polymerase for 30 cycles as described in Innis et al. (1988) supra.

[0177] 3) Transcription. In vitro transcription was performed on the selected amplified templates as described in Milligan et al. (1987) supra, after which DNaseI was added to remove the DNA template. The resultant selected RNA transcripts were then used in step 1 of the next round. Only one-twentieth of the products created at each step of the cycle were used in the subsequent cycles so that the history of the selection could be traced. The progress of the selection method was monitored by filter binding assays of labeled transcripts from each PCR reaction. After the fourth round of selection and amplification, the labeled selected RNA products produced binding to gp43 equivalent to that of wild-type control RNA. The RNA products from each round for one experiment (B) and from the fourth round for all three experiments were gel-purified and sequenced. In FIG. 3, we show the sequence of the purified in vitro transcripts derived from the second, third and fourth rounds of selection and amplification for experiment B. It is clear that there was no apparent loop sequence bias introduced until after the third selection. By this point in the selection, there was a detectable bias which was complete by the fourth round for the apparent consensus sequence A(a/g)(u/c)AAC(u/c)(u/c). Batch sequencing of the RNA transcribed after the fourth selection and amplification for trials A, B, and C is shown in FIG. 4. All three independent runs with different protein/RNA ratios gave similar results. There is some apparent bias for wild-type sequence at each of the four “variable” positions in experiments A and C.
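The select, amplify and transcribe cycle can be caricatured with a toy model in which each species is retained in proportion to its fraction bound and amplification simply restores the pool. This is a sketch under strong assumptions, not a description of the experiment: it treats the pool as two classes (one tight binder and everything else at the measured population Kd), assumes protein is not limiting, and ignores filter background. The function name and the two-class simplification are illustrative.

```python
def selex_round(freqs, kds, protein_conc):
    """One idealized round: partition by fraction bound, then renormalize.

    Renormalizing stands in for amplification and transcription, which
    restore the pool size without changing relative abundances.
    """
    bound = [f * protein_conc / (kd + protein_conc) for f, kd in zip(freqs, kds)]
    total = sum(bound)
    return [b / total for b in bound]


# Class 0: a single wild-type-like loop; class 1: the remaining variants.
kds = [5e-9, 2.5e-7]
freqs = [1 / 65536, 1 - 1 / 65536]

history = [freqs[0]]
for _ in range(4):
    freqs = selex_round(freqs, kds, protein_conc=3e-8)
    history.append(freqs[0])
```

In this toy the tight binder gains about eightfold in relative odds per round. The real selection involves RNA in excess over protein and a continuum of affinities, so the numbers should not be read as predictions.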

[0178] In order to find out what allowable combinations actually existed, we used two “cloning” oligonucleotides which contained restriction site information, to amplify sequences from RNA from the fourth round of experiment B from which individuals were cloned into pUC18 as described (Sambrook et al. (1989) supra; Innis et al. (1988) supra). The selected batches of trial B were chosen for further examination because there appeared to be an even distribution of the two allowable nucleotides which composed each of the four “variable” positions. Twenty individual clones that were identified by colony filter hybridization to the 3′ primer were sequenced. None of these individuals were mutant at any place in the operator sequence outside of the loop sequence positions that were deliberately varied. The sequence distributions are summed up in FIG. 7. Surprisingly, the selected RNA mixture was actually composed of two major loop sequences. One was the wild-type sequence, AAUAACUC, of which 9 out of 20 were isolated. The other, AGCAACCU, was mutant at four positions and existed in 8 of the 20 clones (see FIG. 7). The other three loop sequences detected were single mutations of these two major sequences. Filter binding experiments with labeled in vitro transcripts derived from each of these clonal isolates indicated that there was a rough correlation between binding affinity of an RNA for gp43 and selected abundance (see FIG. 7).
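The batch-sequencing consensus A(a/g)(u/c)AAC(u/c)(u/c) and the two major cloned loops can be compared directly. The sketch below encodes the consensus as a regular expression (an assumption about how to represent it) and lists the positions at which the two loops differ.

```python
import re

# Consensus from batch sequencing, written as a pattern over RNA letters.
CONSENSUS = re.compile(r"A[AG][UC]AAC[UC][UC]")

WILD_TYPE = "AAUAACUC"
MAJOR_VARIANT = "AGCAACCU"


def differing_positions(a, b):
    """Return 1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Four two-way "variable" positions allow 2**4 combinations under the consensus.
allowed_combinations = 2 ** 4
diff_positions = differing_positions(WILD_TYPE, MAJOR_VARIANT)
both_match = all(CONSENSUS.fullmatch(s) for s in (WILD_TYPE, MAJOR_VARIANT))
```

Both loops satisfy the consensus, and they differ at exactly the four variable positions, which is why batch sequencing alone could not distinguish two dominant sequences from an even mixture of sixteen.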

Example 2 Isolation of a Specific RNA Ligand for HIV Reverse Transcriptase

[0179] The reverse transcriptase activity of HIV-1 is composed of a heterodimer of two subunits (p51 and p66) that have common amino termini. The extra carboxyterminal region of the larger peptide comprises the RNaseH domain of reverse transcriptase; the structure of that domain has recently been determined at high resolution.

[0180] It has been previously shown that this HIV-1 reverse transcriptase directly and specifically interacts with its cognate primer tRNA^(Lys3) to which it was experimentally cross-linked at the anti-codon loop and stem. It was also found that only the heterodimer exhibited this specific RNA recognition; neither homodimeric species of reverse transcriptase bound with specificity to this tRNA.

[0181] Two template populations (with approximately 10¹⁴ different sequences each) were created for use in SELEX by ligation. One template population was randomized over 32 nucleotide positions, using fixed sequences at the ends of the randomized region to afford cDNA synthesis and PCR amplification. The second template population had, as additional fixed sequence at the 5′ end of the RNA, the anticodon loop and stem of tRNA^(Lys3). (All oligos used in this work are shown in Table 2.) There was no difference in the affinity of the two randomized populations for HIV-1 reverse transcriptase [RT] (and, as is shown, the RNAs which were selected did not utilize either 5′ region in specific binding). Nine rounds of SELEX with each population were performed using the heterodimer HIV-RT as the target protein.

[0182] The mechanism by which the randomized DNA was prepared utilizing ligations and bridging oligonucleotides was described previously. Such methodology can diminish the total number of different sequences in the starting population from the theoretical limit imposed by DNA synthesis at the 1 micromole scale.

[0183] In these ligation reactions about 1 nanomole of each oligonucleotide was used. The ligated product was gel-purified with an approximate yield of 50%. This purified template was transcribed with T7 RNA polymerase as described above. It was found that HIV RT could saturably bind this random population with a half-maximal binding occurring at about 7×10⁻⁷ M as determined by nitrocellulose assays. All RNA-protein binding reactions were done in a binding buffer of 200 mM KOAc, 50 mM Tris-HCl pH 7.7, 10 mM dithiothreitol. RNA and protein dilutions were mixed and stored on ice for 30 minutes then transferred to 37° C. for 5 minutes. (In binding assays the reaction volume is 60 μl of which 50 μl is assayed; in SELEX rounds the reaction volume is 100 μl.) Each reaction is suctioned through a prewet (with binding buffer) nitrocellulose filter and rinsed with 3 ml of binding buffer after which it is dried and counted for assays or subjected to elution as part of the SELEX protocol. Nine rounds were performed. The RNA concentration for all nine rounds was approximately 3×10⁻⁵ M. HIV-RT was 2×10⁻⁸ M in the first selection and 1×10⁻⁸ M in selections 2-9.
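A rough count of template molecules shows why the starting populations are described as holding on the order of 10¹⁴ sequences. The sketch treats 1 nanomole at 50% gel yield as an upper bound on ligated template (it assumes complete ligation, which the text does not state) and compares it with the number of possible 32-nucleotide sequences.

```python
AVOGADRO = 6.022e23      # molecules per mole

ligated_mol = 1e-9       # about 1 nanomole of each oligonucleotide, as stated
gel_yield = 0.5          # approximate gel-purification yield, as stated

# Upper bound on template molecules, assuming every oligonucleotide was ligated.
template_molecules = ligated_mol * gel_yield * AVOGADRO

# Number of distinct sequences possible over 32 randomized positions.
possible_32mers = 4 ** 32

# Fraction of sequence space that the starting pool could sample at most.
coverage = template_molecules / possible_32mers
```

Under these assumptions the pool holds about 3×10¹⁴ molecules, a small fraction (on the order of 10⁻⁵) of the roughly 1.8×10¹⁹ possible 32-mers.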

[0184] The experiment using RNA containing the tRNA^(Lys3) anticodon loop and stem was completed first. Nitrocellulose filter binding assays performed at the ninth round revealed that the RNA population had increased about 100-fold in affinity to HIV-1 RT when compared to the starting candidate mixture, but that the background binding to nitrocellulose filters in the absence of protein had increased from about 2% of input RNA to 15%. Individual sequences were cloned from this population (after filtration through nitrocellulose filters to delete some of the high background of potential sequences selected for retention by filters alone) and are listed in Table 3. Nitrocellulose filter binding assays of selected sequences' affinity for HIV RT are shown in FIG. 14. Some of the sequences were selected as ligands for HIV-RT, exemplified by the binding curves of ligands 1.1 and 1.3a, and show some sequence homology as illustrated by Tables 4 and 5. Some of the ligand sequences exhibit significant retention on nitrocellulose filters in the absence of protein, exemplified by ligand 1.4 (FIG. 14), and seem to be characterized by a long helix with a loop of purine repeat elements (as shown in Table 4). In spite of our minimal, late efforts to delete them in this experiment prior to cloning, these sequences represented a significant part of those collected from this experiment.

[0185] As a consequence, experiment 2 (which has a different 5′ fixed sequence) was pre-filtered through nitrocellulose before the first, third, sixth and ninth rounds of selection. The sequences collected from this experiment are shown in Table 6. There are again many sequences with homology to those of high affinity from experiment 1 as shown in Tables 4 and 5. There are many fewer, if any, sequences that fit the motif of sequences retained by nitrocellulose filters alone. Nitrocellulose binding assays of selected ligand sequences from this experiment compared to that of ligand 1.1 are shown in FIG. 15.

[0186] High affinity ligand RNAs with the most common sequence (1.1) and a similar sequence (1.3a) were further analyzed to determine the boundaries of the information required for high affinity binding to HIV-1 RT. The results of these experiments are shown in FIG. 16. These experiments establish that the motif common to these sequences, UUCCGNNNNNNNNCGGGAAA (SEQ ID NO:9), is similarly positioned within the recognition domain. The sequences UUCCG and CGGGA of this motif may base-pair to form an RNA helix with an eight base loop. In order to discover what besides these fixed sequences may contribute to high affinity binding to HIV-1 RT, a candidate mixture template was created that contained random incorporation at the nucleotide positions that differ from these two sequences as shown in Table 7. After eight rounds of SELEX, individual sequences were cloned and sequenced. The 46 sequences are shown in Table 7. Inspection of these sequences reveals extensive base-pairing between the central 8n variable region and the downstream 4n variable region and flanking sequences; base-pairing which in combination with that discussed above would indicate an RNA pseudoknot. That no specific sequences predominate in this evolved population suggests that there is no selection at the primary sequence level and that selection occurs purely on the basis of secondary structure, that is, there are many sequence combinations that give similar affinities for HIV-1 RT, and none have competitive advantage. Analysis of the first and second SELEX experiments reveals that the individual sequences which comprise those populations that have homology to the UUCCG . . . CGGGANAA motif also show a strong potential for this pseudoknot base-pairing.
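The statement that UUCCG and CGGGA may base-pair around an eight base loop can be checked mechanically. The sketch below assumes that G-U wobble pairs are permitted alongside Watson-Crick pairs, and it uses a made-up test sequence; neither the helper names nor the example sequence come from the tables.

```python
import re

# UUCCG, any eight nucleotides, then CGGGA.
MOTIF = re.compile(r"UUCCG([ACGU]{8})CGGGA")

# Watson-Crick pairs plus G-U wobble (the wobble allowance is an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(strand_5p, strand_3p):
    """True if two strands, both written 5' to 3', can form an antiparallel helix."""
    return len(strand_5p) == len(strand_3p) and all(
        (x, y) in PAIRS for x, y in zip(strand_5p, reversed(strand_3p))
    )


# Hypothetical sequence built only to exercise the pattern.
example = "GG" + "UUCCG" + "AAAAGGCC" + "CGGGA" + "AAA"
match = MOTIF.search(example)
loop = match.group(1)
stem_ok = can_pair("UUCCG", "CGGGA")
```

Read antiparallel, the stem is U-A, U-G, C-G, C-G, G-C, so under this pairing rule the helix contains one wobble pair.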

[0187] FIG. 31 shows a schematic diagram of what is referred to herein as a pseudoknot. A pseudoknot is comprised of two helical sections and three loop sections. Not all pseudoknots contain all three loops. For the purposes of interpreting the data obtained, the various sections of the pseudoknot have been labeled as shown in FIG. 31. For example, in Table 5 several of the sequences obtained in experiments one and two are listed according to the pseudoknot configuration assumed by the various sequences.

[0188] The results of experiments one and two, as defined in Table 5, led to experiment three wherein sequences in S1(a), S1(b) and L3 were fixed. Again, the SELEX derived nucleic acids were configured almost exclusively in pseudoknots. Examination of the results in each of the experiments reveals that the nucleic acid solution to HIV-RT contains a relatively large number of members, the most basic common denominator being that they are all configured as pseudoknots. Other generalizations defining the nucleic acid solution for HIV-RT are as follows:

[0189] 1) S1(a) often comprises the sequence 5′-UUCCG-3′ and S1(b) often comprises the sequence 5′-CGGGA-3′. However, base pair flips are allowed, and the stem may be shortened.

[0190] 2) L1 may be short or long, but often comprises two nucleotides in the best binding nucleic acids. The 5′ nucleotide in L1 often is either a U or an A.

[0191] 3) S2 is usually comprised of 5 or 6 base pairs, and appears to be sequence independent. This stem may contain non-Watson/Crick pairs.

[0192] 4) L2 may be comprised of no nucleotides, but when it exists, the nucleotides are preferably A's.

[0193] 5) L3 is generally 3 or more nucleotides, enriched in A.

[0194] 6) In most sequences obtained by SELEX, the total number of nucleotides in L1, S2(a) and L2 equals 8.
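The six generalizations above can be expressed as checks on a segmented sequence. The sketch assumes the segment order S1(a), L1, S2(a), L2, S1(b), L3, S2(b) from 5′ to 3′, which is consistent with item 6 but is not spelled out in the text, and it reads "enriched in A" as "at least half A", an arbitrary threshold chosen for illustration. The example pseudoknot is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pseudoknot:
    """Segments in the assumed 5'-to-3' order: S1(a) L1 S2(a) L2 S1(b) L3 S2(b)."""
    s1a: str
    l1: str
    s2a: str
    l2: str
    s1b: str
    l3: str
    s2b: str

    def sequence(self) -> str:
        return self.s1a + self.l1 + self.s2a + self.l2 + self.s1b + self.l3 + self.s2b

    def observations(self) -> dict:
        """Report which of the listed tendencies this sequence follows."""
        return {
            "s1_consensus": self.s1a == "UUCCG" and self.s1b == "CGGGA",
            "l1_two_nucleotides": len(self.l1) == 2,
            "l1_starts_with_U_or_A": self.l1[:1] in ("U", "A"),
            "s2_has_5_or_6_pairs": len(self.s2a) in (5, 6),
            "l2_absent_or_all_A": set(self.l2) <= {"A"},
            # "Enriched in A" is read here as at least half A (an assumption).
            "l3_three_plus_and_A_rich": len(self.l3) >= 3
            and 2 * self.l3.count("A") >= len(self.l3),
            "l1_s2a_l2_total_8": len(self.l1) + len(self.s2a) + len(self.l2) == 8,
        }


# Hypothetical pseudoknot invented to satisfy every tendency.
example_knot = Pseudoknot("UUCCG", "UA", "GCUCA", "A", "CGGGA", "AAA", "UGAGC")
report = example_knot.observations()
```

Because the text describes tendencies ("often", "usually", "preferably"), a real ligand failing one of these checks would not be excluded from the family.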

[0195] A primary purpose of this experiment was to find ligand solutions to HIV-1 RT. The ability of the evolved ligand clone 1.1 to inhibit reverse transcriptase activity was compared to that of the starting population for experiment 1, and is shown in FIG. 17. Even at equal concentrations of inhibitor RNA to RT, the reverse transcriptase is significantly inhibited by ligand 1.1. In contrast, only at 10 mM (or 200-fold excess) starting population RNA is there any significant inhibition of the HIV-1 RT. Thus, the high affinity ligand to HIV-1 RT either blocks or directly interacts with the catalytic site of the enzyme.

[0196] In order to test the specificity of this inhibition, various concentrations of ligand 1.1 were assayed for inhibition of MMLV, AMV and HIV-1 reverse transcriptase. The results of that experiment, shown in FIG. 18, indicate that the inhibition by ligand 1.1 is specific to HIV-1 reverse transcriptase.

Example 3 Isolation of Specific RNA Ligand for Bacteriophage R17 Coat Protein

[0197] SELEX was performed on the bacteriophage R17 coat protein. The protein was purified as described by Carey et al., Biochemistry, 22, 2601 (1983). The binding buffer was 100 mM potassium acetate plus 10 mM dithiothreitol plus 50 mM Tris-acetate pH 7.5. Protein and RNA were incubated together for three minutes at 37° C. and then filtered on nitrocellulose filters to separate protein-bound RNA from free RNA. The filters were washed with 50 mM Tris-acetate pH 7.5.

[0198] Protein was at 1.2×10⁻⁷ M for the first four rounds of SELEX and at 4×10⁻⁸ M for rounds five through 11.

[0199] The starting RNA was transcribed from DNA as described previously. The DNA sequence includes a bacteriophage T7 RNA polymerase promoter sequence that allows RNA to be synthesized according to standard techniques. cDNA synthesis during the amplification portion of the SELEX cycle is primed by a DNA of the sequence:

cDNA primer (PCR primer 1): 5′-GTTTCAATAGAGATATAAAATTCTTTCATAG-3′ (SEQ ID NO:10)

[0200] The DNA amplified from the cDNA thus had a sequence including the T7 promoter, 32 randomized positions, an AT dinucleotide, and the fixed sequence complementary to PCR primer 1. The RNA that is used to begin the first cycle of SELEX thus has the sequence:

pppGGGAGCCAACACCACAAUUCCAAUCAAG-32N-AUCUAUGAAAGAAUUUUAUCUCUAUUGAAAC (SEQ ID NO:11)

[0201] A set of clones from after the 11th round of SELEX was obtained and sequenced. Within the 38 different sequences obtained in the 47 clones were three found more than once: one sequence was found six times, one sequence four times, and another two times. The remaining 35 sequences were found once each. Two sequences were not similar to the others with respect to primary sequences or likely secondary structures, and were not analyzed further. Thirty-six sequences had in common the sequence ANCA situated as a tetranucleotide loop of a bulged hairpin; the bulged nucleotide was an adenine in all 36 cases. The sequences of the entire set are given in Table 8, aligned by the four nucleotides of the hairpin loop. The two nucleotides 3′ to the randomized portion of the starting RNA (an AU) are free to change or be deleted since the cDNA primer does not include the complementary two nucleotides; many clones have changed one or both of those nucleotides.
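The clone tallies above are internally consistent, as the short check below shows. It also encodes the ANCA loop as a pattern; the list of example loops is illustrative, with AUUA included because the natural site carries a uridine where the selected loops carry a cytidine.

```python
import re

copies_of_repeated_sequences = [6, 4, 2]   # sequences found more than once
singleton_sequences = 35                   # sequences found exactly once

total_clones = sum(copies_of_repeated_sequences) + singleton_sequences
distinct_sequences = len(copies_of_repeated_sequences) + singleton_sequences
analyzed_sequences = distinct_sequences - 2   # two dissimilar sequences set aside

# Tetranucleotide loop shared by the analyzed sequences: A, any base, C, A.
ANCA = re.compile(r"A[ACGU]CA")
loop_matches = {loop: bool(ANCA.fullmatch(loop)) for loop in ("AUCA", "ACCA", "AGCA", "AACA", "AUUA")}
```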

[0202] The winning RNA motif, shown in FIG. 19, bears a direct relationship to the coat binding site identified earlier through site-directed mutagenesis and binding studies.

[0203] See Uhlenbeck et al. supra (1983); Romaniuk et al. supra (1987). However, some of the sequences are more conserved in this set than might have been expected. The loop sequence AUCA predominates, while earlier binding data might have suggested that ANCA sequences are all equivalent. The natural binding site on the R17 genome includes the sequence and structure shown below:

     U U
    A   A
     G C
     G C
    A
     G C

[0204] The natural structure includes the sequence GGAG, which serves to facilitate ribosome binding and initiation of translation of the R17 replicase coding region. During SELEX that requirement is not present, and the winning sequences contain around the loop and bulge C:G base pairs more often than G:C base pairs. SELEX, therefore, relaxes the constraints of biology and evolutionary history, leading to ligands with higher affinities than the natural ligand. Similarly, the loop cytidine found in each of the 36 sequences is a uridine in the natural site, and it is known that C provides higher affinity than U. During evolution natural sites must have an appropriate affinity rather than the highest affinity, since the tightest binding may lead to disadvantages for the organism.

Example 4 Isolation of a Nucleic Acid Ligand for a Serine Protease

[0205] Serine proteases are protein enzymes that cleave peptide bonds within proteins. The serine proteases are members of a gene family in mammals, and are important enzymes in the life of mammals. Serine proteases are not known to bind to nucleic acids. Examples of serine proteases are tissue plasminogen activator, trypsin, elastase, chymotrypsin, thrombin, and plasmin. Many disease states can be treated with nucleic acid ligands that bind to serine proteases, for example, disorders of blood clotting and thrombus formation. Proteases other than serine proteases are also important in mammalian biology, and these too would be targets for nucleic acid ligands with appropriate affinities obtained according to the invention herein taught.

[0206] Human tissue plasminogen activator (htPA), available from commercial sources, was chosen as a serine protease to place through the SELEX method of this invention.

[0207] The RNA candidate mixture used was identical to that described in Example 11 below in the HSV DNA polymerase experiment. Binding during SELEX was in 50 mM NaCl plus 50 mM Tris-acetate pH 7.5 for 3 minutes at 37° C. SELEX was carried out for ten rounds. The 30N candidate mixture bound to tPA with an affinity (Kd) of 7×10⁻⁸ M in 150 mM NaAc plus 50 mM Tris-acetate pH 7.5; the affinity of the RNA present after nine rounds of SELEX was about threefold tighter. Nine clones were isolated, sequenced, and some of these were tested for binding to tPA as pure RNAs. The sequences of the nine clones obtained at low salt were as follows:

Name  #  Sequence of random region        SEQ ID NO:
A1    3  ACGAAACAAAUAAGGAGGAGGAGGGAUUGU   12
A2    1  AGGAGGAGGAGGGAGAGCGCAAAUGAGAUU   13
A3    1  AGGAGGAGGAGGUAGAGCAUGUAUUAAGAG   14
B     1  UAAGCAAGAAUCUACGAUAAAUACGUGAAC   15
C     1  AGUGAAAGACGACAACGAAAAACGACCACA   16
D     1  CCGAGCAUGAGCCUAGUAAGUGGUGGAUA    17
E     1  UAAUAAGAGAUACGACAGAAUACGACAUAA   18

[0208] All tested sequences bound at least somewhat better than the starting 30N candidate mixture. However, the A series bound to nitrocellulose better in the absence of tPA than did the candidate mixture, as though the shared sequence motif caused retention on the nitrocellulose matrix by itself. That motif, a run of AGG repeats, is present in each A series sequence shown above. In other SELEX experiments AGG repeats have been isolated when trying to identify a ligand solution to HIV-1 reverse transcriptase, the human growth hormone receptor extracellular domain, and even the R17 coat protein in a first walking experiment. When tested, these sequences show modest or substantial binding to nitrocellulose filters without the target protein being present. It appears that the AGG repeats may be found in hairpin loops. Since SELEX is an iterative process in most embodiments, it is not surprising that such binding motifs would emerge.
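A screen for the filter-binding motif can be sketched as a search for tandem AGG repeats in the cloned random regions. The threshold of three or more consecutive repeats is an assumption chosen for illustration; the sequences and clone counts are those listed for the nine tPA clones.

```python
import re

# Three or more consecutive AGG units (the threshold is an assumed cutoff).
AGG_REPEATS = re.compile(r"(?:AGG){3,}")

# name: (number of clones, sequence of random region)
tpa_clones = {
    "A1": (3, "ACGAAACAAAUAAGGAGGAGGAGGGAUUGU"),
    "A2": (1, "AGGAGGAGGAGGGAGAGCGCAAAUGAGAUU"),
    "A3": (1, "AGGAGGAGGAGGUAGAGCAUGUAUUAAGAG"),
    "B": (1, "UAAGCAAGAAUCUACGAUAAAUACGUGAAC"),
    "C": (1, "AGUGAAAGACGACAACGAAAAACGACCACA"),
    "D": (1, "CCGAGCAUGAGCCUAGUAAGUGGUGGAUA"),
    "E": (1, "UAAUAAGAGAUACGACAGAAUACGACAUAA"),
}

with_motif = sorted(name for name, (_, seq) in tpa_clones.items() if AGG_REPEATS.search(seq))
clones_with_motif = sum(count for name, (count, _) in tpa_clones.items() if name in with_motif)
total_clones = sum(count for count, _ in tpa_clones.values())
```

Only the A series carries the repeat, so a screen of this kind would flag five of the nine clones as candidates for target-independent retention, to be confirmed by binding assays without protein.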

[0209] The existence of nitrocellulose binding motifs may be avoided by one or more of several obvious strategies. RNA may be filtered through the nitrocellulose filters prior to SELEX to eliminate such motifs. Alternative matrices may be used in alternate rounds of SELEX, e.g., glass fiber filters. Alternative partitioning systems may be used, e.g., columns, sucrose gradients, etc. It is obvious that any given single process will lead to biases in the iterative process that will favor motifs that do not have increased binding to the target, but are selected by the selection process. It is, therefore, important to use alternating processes or screening processes to eliminate these motifs. It has been shown that the AGG repeats, like other motifs isolated as biases that are target independent, will tend to emerge most frequently when the affinity of the best sequences for the target is rather low or when the affinities of the best sequences are only slightly better than the affinity of the starting candidate mixture for the target.

Example 5 Isolation of a Nucleic Acid Ligand for a Mammalian Receptor

[0210] Mammalian receptors often are proteins that reside within the cytoplasmic membranes of cells and respond to molecules circulating outside of those cells. Most receptors are not known to bind to nucleic acids. The human growth hormone receptor responds to circulating human growth hormone, while the insulin receptor responds to circulating insulin. Receptors often have a globular portion of the molecule on the extracellular side of the membrane, and said globular portion specifically binds to the hormone (which is the natural ligand). Many disease states can be treated with nucleic acid ligands that bind to receptors.

[0211] Ligands that bind to a soluble globular domain of the human growth hormone receptor (shGHR) are identified and purified using the candidate mixture of Example 4. Again, the binding buffers are free of DTT. The soluble globular domain of the human growth hormone receptor is available from commercial and academic sources, having usually been created through recombinant DNA technology applied to the entire gene encoding a membrane-bound receptor protein. SELEX is used reiteratively until ligands are found. The ligands are cloned and sequenced, and binding affinities for the soluble receptor are measured.

[0212] Binding affinities are measured for the same ligand for other soluble receptors in order to ascertain specificity, even though most receptors do not show strong protein homologies with the extracellular domains of other receptors. The ligands are used to measure inhibition of the normal binding activity of shGHR by measuring competitive binding between the nucleic acid ligand and the natural (hormone) ligand.

Example 6 Isolation of a Nucleic Acid Ligand for a Mammalian Hormone or Factor

[0213] Mammalian hormones or factors are proteins, e.g., growth hormone, or small molecules (e.g., epinephrine, thyroid hormone) that circulate within the animal, exerting their effects by combining with receptors that reside within the cytoplasmic membranes of cells. For example, the human growth hormone stimulates cells by first interacting with the human growth hormone receptor, while insulin stimulates cells by first interacting with the insulin receptor. Many growth factors, e.g., granulocyte colony stimulating factor (GCSF), including some that are cell-type specific, first interact with receptors on the target cells. Hormones and factors, then, are natural ligands for some receptors. Hormones and factors are not known, usually, to bind to nucleic acids. Many disease states, for example, hyperthyroidism and chronic hypoglycemia, can be treated with nucleic acid ligands that bind to hormones or factors.

[0214] Ligands that bind to human insulin are identified and purified using the starting material of Example 3. Human insulin is available from commercial sources, having usually been created through recombinant DNA technology. SELEX is used reiteratively until a ligand is found. The ligands are cloned and sequenced, and the binding affinities for human insulin are measured. Binding affinities are measured for the same ligand for other hormones or factors in order to ascertain specificity, even though most hormones and factors do not show strong protein homologies with human insulin. However, some hormone and factor gene families exist, including a small family of IGF, or insulin-like growth factors. The nucleic acid ligands are used to measure inhibition of the normal binding activity of human insulin to its receptor by measuring competitive binding with the insulin receptor and the nucleic acid ligand in the presence or absence of human insulin, the natural ligand.

Example 7 Preparation of Column Matrix for SELEX

[0215] Following the procedures as described in Example 9 below, it was shown that the polypeptide bradykinin is not retained by nitrocellulose. To enable the SELEX process on bradykinin, the polypeptide was attached to Activated CH Sepharose 4B (Pharmacia LKB) as a support matrix according to standard procedures. The resulting matrix was determined to be 2.0 mM bradykinin by ninhydrin assay. See Crestfield et al. J. Biol. Chem. vol. 238, pp. 622-627 (1963); Rosen Arch. Biochem. Biophys., vol. 67, pp. 10-15 (1957). The activated groups remaining on the support matrix were blocked with Tris. See Pharmacia, Affinity Chromatography: Principles and Methods, Ljungforetagen AB, Uppsala, Sweden (1988).

[0216] Spin-column separation was used to contact solutions of candidate mixtures with beaded matrix. In a general procedure for performing a selection step for SELEX, 40 μL of a 50:50 slurry of target sepharose in reaction buffer is transferred to a 0.5 ml Eppendorf tube. The RNA candidate mixture is added with 60 μL of reaction buffer, and the reaction mixture is allowed to equilibrate for 30 minutes at 37° C. A hole is pierced in the bottom of the tube, and the tube is placed inside a larger Eppendorf tube, both caps removed, and the tubes spun (1000 RPM, 10″, 21° C.) to separate the eluate. The small tube is then transferred to a new larger tube, and the contents washed four times by layering with 50 μL of the selected wash buffer and spinning. To conduct binding assays, the tube containing the radioactive RNA is transferred to a new Eppendorf tube and spun to dryness.

[0217] A bulk binding experiment was performed wherein an RNA candidate mixture comprised of a 30 nucleotide randomized segment was applied to the bradykinin sepharose matrix. Using the spin-column technique, the binding of the bulk 30N RNA to various matrices was determined under high salt concentrations to determine the best conditions for minimizing background binding to the sepharose. Background binding of RNA to sepharose was minimized by blocking activated groups on the sepharose with Tris, and using a binding buffer of 10 mM DEM and 10-20 mM KOAc. At this buffer condition, a binding curve of the randomized bulk solution of RNA yielded a bulk Kd of about 1.0×10⁻⁵ M.

[0218] See FIG. 20. The curve was determined by diluting the bradykinin sepharose against blocked, activated sepharose.

Example 8 Preparation of Candidate Mixtures Enhanced in RNA Motif Structures

[0219] In the preferred embodiment, the candidate mixture to be used in SELEX is comprised of a contiguous region of between 20 and 50 randomized nucleic acids. The randomized segment is flanked by fixed sequences that enable the amplification of the selected nucleic acids.

[0220] In an alternate embodiment, the candidate mixtures are created to enhance the percentage of nucleic acids in the candidate mixture possessing given nucleic acid motifs. Although two specific examples are given here, this invention is not so limited. One skilled in the art would be capable of creating equivalent candidate mixtures to achieve the same general result.

[0221] In one specific example, shown as Sequence A in FIG. 21, the candidate mixture is prepared so that most of the nucleic acids in the candidate mixture will be biased to form a helical region of between 4 and 8 base pairs, and a “loop” of either 20 or 21 contiguous randomized sequences. Both 5′ and 3′ ends of the sequence mixture will contain fixed sequences that are essential for the amplification of the nucleic acids. Adjacent to these functional fixed sequences will be fixed sequences chosen to base pair with fixed sequences on the alternate side of the randomized region. Going from the 5′ to the 3′ end of the sequences, there will be 5 distinct regions: 1) fixed sequences for amplification; 2) fixed sequences for forming a helical structure; 3) 20 or 21 randomized nucleic acid residues; 4) fixed sequences for forming a helical structure with the region 2 sequences; and 5) fixed sequences for amplification. The A candidate mixture of FIG. 21 will be enriched in hairpin loop and symmetric and asymmetric bulged motifs. In a preferred embodiment, the candidate mixture would contain equal amounts of sequences where the randomized region is 20 and 21 bases long.
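The five-region layout described for Sequence A can be sketched as a generator of candidate RNAs. The fixed amplification sequences and the six base pair stem used below are hypothetical placeholders, not the sequences of FIG. 21; only the region order and the 20 or 21 nucleotide loop follow the text.

```python
import random

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that pairs antiparallel with the given RNA."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def hairpin_candidate(rng, fixed_5p, stem, fixed_3p, loop_lengths=(20, 21)):
    """Regions 1 to 5: fixed, stem, random loop, stem partner, fixed."""
    loop = "".join(rng.choice("ACGU") for _ in range(rng.choice(loop_lengths)))
    return fixed_5p + stem + loop + reverse_complement(stem) + fixed_3p


# Hypothetical placeholder sequences (not taken from FIG. 21).
FIXED_5P = "GGGAGAUACC"
STEM = "GCAUCG"            # 6 base pairs, within the stated range of 4 to 8
FIXED_3P = "AUCUAUGAAAG"

rng = random.Random(0)
pool = [hairpin_candidate(rng, FIXED_5P, STEM, FIXED_3P) for _ in range(1000)]
```

Choosing the loop length at random with equal probability gives, on average, the equal amounts of 20 and 21 nucleotide loops described as preferred.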

[0222] A second example, shown in FIG. 21 as sequence B, is designed to enrich the candidate mixture in nucleic acids held in the pseudoknot motif. In this candidate mixture, the fixed amplification sequences flank three regions of 12 randomized positions. The three randomized regions are separated by two fixed regions of four nucleotides, the fixed sequences selected to preferably form a four basepair helical structure. Going from the 5′ to the 3′ end of the sequence, there will be 7 distinct regions: 1) fixed sequences for amplification; 2) 12 randomized nucleotides; 3) fixed sequences for forming a helical structure; 4) 12 randomized nucleotides; 5) fixed sequences for forming a helical structure with the region 3 nucleotides; 6) 12 randomized nucleotides; and 7) fixed sequences for amplification.

[0223] In a preferred candidate mixture, the engineered helical regions are designed to yield alternating GC, CG, GC, CG basepairs. This basepair motif has been shown to give a particularly stable helical structure.
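The seven-region layout of sequence B, combined with an alternating GC, CG, GC, CG helix, can be sketched as follows. The choice of GCGC for both four-nucleotide fixed regions is an assumption made for illustration (it is self-complementary, so two copies can pair in the alternating pattern); the amplification sequences are hypothetical placeholders, since the actual sequences appear only in FIG. 21.

```python
import random

# Assumed helix-forming region: GCGC is its own reverse complement, so two
# copies can pair antiparallel as GC, CG, GC, CG.
HELIX = "GCGC"

# Hypothetical placeholders for the amplification sequences.
FIVE_PRIME_FIXED = "GGGAGAUACCAGC"
THREE_PRIME_FIXED = "GCAUGACGUUCGA"


def random_region(length, rng):
    """Return a randomized stretch of the given length."""
    return "".join(rng.choice("ACGU") for _ in range(length))


def pseudoknot_candidate(rng):
    """Assemble regions 1 to 7 in 5' to 3' order."""
    regions = [
        FIVE_PRIME_FIXED,        # 1) amplification
        random_region(12, rng),  # 2) randomized
        HELIX,                   # 3) helix-forming
        random_region(12, rng),  # 4) randomized
        HELIX,                   # 5) pairs with region 3
        random_region(12, rng),  # 6) randomized
        THREE_PRIME_FIXED,       # 7) amplification
    ]
    return "".join(regions)
```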

Example 9 Bulk Binding of Randomized RNA Sequences to Proteins not Known to Bind Nucleic Acids

[0224] Following the general nitrocellulose selection procedures as described in Example 1 above for SELEX, a group of randomly selected proteins was tested to determine if they showed any affinity to a bulk candidate mixture of RNA sequences. The candidate mixture utilized in each experiment consisted of a 40N RNA solution (a randomized mixture having a randomized nucleic acid segment of 40 positions) that was radiolabeled to detect the percentage of binding. The candidate mixture was diluted in binding buffer (200 mM KOAc, 50 mM Tris-OAc pH 7.7, 10 mM DTT) and 30 μL was used in a 60 μL binding reaction. To each reaction was added 20 μL, 10 μL or 1 μL of each protein. Binding buffer was added to reach a total volume of 60 μL. The reactions were incubated at 37° C. for 5 minutes and then subjected to filter binding.

[0225] The proteins tested were Acetylcholinesterase (MW 230,000); N-acetyl-β-D-glucosaminidase (MW 180,000); Actin (MW 43,000); Alcohol Dehydrogenase (MW 240,000); Aldehyde Dehydrogenase (MW 200,000); Angiotensin (MW 1297); Ascorbate Oxidase (MW 140,000); Atrial Natriuretic Factor (MW 3,064); and Bombesin (MW 1621). The proteins were purchased from Boehringer Ingelheim, and were utilized in the buffer composition in which they are sold.

[0226] The RNA candidate mixture used in each experiment contained 10,726 counts of radiolabel, and a background binding of about 72 counts was found. The results are summarized in Table 9. All proteins tested except Acetylcholinesterase, N-acetyl-β-D-glucosaminidase and Actin were found to yield some bulk RNA affinity. Because of the low concentration of N-acetyl-β-D-glucosaminidase in solution as purchased, the results for that protein are not definitive. In addition, if any of the proteins tested do not bind to nitrocellulose, which is the case for bradykinin, no affinity would be detected in this experiment. Example 7 above, discussing column supported bradykinin, demonstrates that the failure to show bulk binding in this experiment does not mean that bulk binding does not exist for a given protein.
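The input and background counts quoted above allow a percentage of bound RNA to be computed from a filter count. The sketch below assumes, for illustration only, that the background is subtracted from the filter count and the result is divided by the input counts; the text does not state the exact formula used, and any filter count passed in is a hypothetical value, not a result from Table 9.

```python
INPUT_COUNTS = 10726      # counts of radiolabel per reaction, from the text
BACKGROUND_COUNTS = 72    # approximate background binding, from the text


def percent_bound(filter_counts,
                  input_counts=INPUT_COUNTS,
                  background=BACKGROUND_COUNTS):
    """Background-corrected percentage of input RNA retained on the filter."""
    net = max(filter_counts - background, 0)  # clamp at zero below background
    return 100.0 * net / input_counts
```

For example, a hypothetical filter count equal to the background gives 0%, and any protein whose retained counts do not exceed background would be scored as showing no bulk affinity under this convention.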

Example 10 Isolation of RNA Ligand Solution for Nerve Growth Factor

[0227] Nerve growth factor (NGF) is a protein factor that acts through a receptor on the outside surfaces of target cells. Antagonists toward growth factors and other hormones can act by blocking a receptor or by titrating the factor or hormone. An RNA was sought by the SELEX process that binds directly to NGF.

[0228] The starting RNAs were prepared exactly as in the case of HSV DNA polymerase (Example 11).

[0229] Two different experiments were done with NGF. The first was a ten round SELEX using low salt binding buffer, a 3 minute incubation at 37 degrees, and then filtration and a wash with the same buffer during the SELEX. The low salt binding buffer was 50 mM NaCl plus 50 mM Tris-acetate pH 7.5. The second experiment used as the binding buffer 200 mM NaCl plus 50 mM Tris-acetate pH 7.5, and then after filtration a wash with 50 mM Tris-acetate pH 7.5; this SELEX experiment went through only seven rounds.

[0230] The low salt experiment yielded 36 cloned sequences. Fifteen of the clones were nearly identical: #'s 2, 3, 4, 5, 6, 8, 11, 13, 19, 22, 28, 33, and 34 were identical, while #'s 15 and 25 each had a single difference:

ACAUCGAUGACCGGAAUGCCGCACACAGAG (SEQ ID NO:19)
+A (15)   G (25)

[0231] A second abundant sequence, found six times, was:

CCUCAGAGCGCAAGAGUCGAACGAAUACAG (SEQ ID NO:20) (#'s 12, 20, 27, and 31)
G (21)   C (1)

[0232] From the high salt SELEX ten clones have been sequenced, but eight of them are identical and obviously related to the abundant (but minor) second class from the low salt experiment. The winning sequence is:

[0233] - - - CUCAUGGAGCGCAAGACGAAUAGCUACAUA - - - (SEQ ID NO:21)

[0234] Between the two experiments a total of 14 different sequences were obtained (sequences with one difference are lumped together in this analysis); they are listed here with the frequencies noted. ngf.a through ngf.k are from the low salt experiment, while hsngf.a through hsngf.c are from the high salt experiment:

Name      Sequence                           Frequency   SEQ ID NO
ngf.a     ACAUCGAUGACCGGAAUGCCGCACACAGAG     15/36       22
ngf.b     CCUCAGAGCGCAAGAGUCGAACGAAUACAG     6/36        23
ngf.c     CGGGUGAUUAGUACUGCAGAGCGGAAUCAC     5/36        24
ngf.d     UGCGAUAAGACUUGCUGGGCGAGACAAACA     3/36        25
ngf.e     AGUCCACAGGGCACUCCCAAAGGGCAAACA     1/36        26
ngf.f     ACUCAUGGAGCGCUCGACGAUCACCAUCGA     1/36        27
ngf.g     CAAGGGAGAAUGUCUGUAGCGUCCAGGUA      1/36        28
ngf.h     CGACGCAGAGAUACGAAUACGACAAAACGC     1/36        29
ngf.i     GAGGGUGAGGUGGGAGGCAGUGGCAGUUUA     1/36        30
ngf.j     UGAACUACGGGGGAGGGAGGGUGGAAGACAG    1/36        31
ngf.k     GUGGAGGGGACGUGGAGGGGAGAGCAAGA      1/36        32
hsngf.a   CUCAUGGAGCGCAAGACGAAUAGCUACAUA     8/10        33
hsngf.b   CCAUAGAGGCCACAAGCAAAGACUACGCA      1/10        34
hsngf.c   CCUACAAGAAAAGAGGGAAGGAGAAAAAAA     1/10        35
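The lumping of clones that differ by a single position, used above to arrive at the family frequencies, can be sketched as a small grouping routine. This is an illustrative assumption about how such a tally could be done, not the procedure actually used; the two variant clones in the example data are hypothetical, because the positions of the differences in clones 15 and 25 are not recoverable from the text.

```python
def within_one_edit(a, b):
    """True if a and b are identical or differ by one substitution,
    insertion or deletion."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) sequence
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # at most one substitution
    return a[i:] == b[i + 1:]          # one extra base in b


def lump(clones):
    """Group each clone with the first family whose representative
    is within one edit; return (representative, count) pairs."""
    families = []
    for clone in clones:
        for family in families:
            if within_one_edit(family[0], clone):
                family[1] += 1
                break
        else:
            families.append([clone, 1])
    return [(rep, count) for rep, count in families]


NGF_A = "ACAUCGAUGACCGGAAUGCCGCACACAGAG"  # SEQ ID NO:19
NGF_B = "CCUCAGAGCGCAAGAGUCGAACGAAUACAG"  # SEQ ID NO:20

# Hypothetical variants: one inserted A and one G substitution.
example_clones = (
    [NGF_A] * 13
    + [NGF_A[:5] + "A" + NGF_A[5:], NGF_A[:10] + "G" + NGF_A[11:]]
    + [NGF_B] * 6
)
```

Calling `lump(example_clones)` groups the 13 identical clones and the two single-difference variants into one family of 15, alongside a second family of 6.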

[0235] While no obvious secondary structure is embedded within the similar sequences, it is likely that the winning sequences place critical nucleotides into a structure that is well fit by an NGF binding site.

[0236] A binding assay of nucleic acid hsngf.a to NGF was performed, and this nucleic acid was found to have a Kd of about 20 to 30 fold higher than the bulk 30N candidate mixture. The same nucleic acid was also found to have a lower or equal affinity to R17 coat protein and tPA than a 30N candidate mixture. Thus, the SELEX derived nucleic acid ligand hsngf.a is a selective ligand to NGF.

Example 11 Isolation of a Nucleic Acid Ligand for HSV-1 DNA Polymerase

[0237] Herpes simplex virus (HSV-1) is a DNA-containing virus of mammals. HSV-1, like many DNA-containing viruses, encodes its own DNA polymerase. The HSV-1 DNA polymerase has been purified in two forms, which have different qualities but each of which will catalyze DNA replication in vitro. The simple form, which is one polypeptide, is purified from cells expressing the cloned gene according to Hernandez, T. R. and Lehman, I. R., J. Biol. Chem., 265, 11227-11232 (1990). The second form of DNA polymerase, a heterodimer, is purified from HSV-1 infected cells according to Crute, J. J. and Lehman, I. R., J. Biol. Chem., 264, 19266-19270 (1989); the heterodimer contains one peptide corresponding to the polymerase itself and another, UL42, also encoded by HSV-1.

[0238] SELEX was performed on both the single polypeptide and the heterodimer. The binding buffer in each case was 50 mM potassium acetate plus 50 mM Tris acetate, pH 7.5, and 1 mM dithiothreitol. Filtration to separate bound RNA was done after four minutes of incubation at 37 degrees; the filters were washed with binding buffer minus dithiothreitol.

[0239] The RNA candidate mixture was transcribed from DNA as described previously. As is the case in other embodiments, the DNA sequence includes a bacteriophage T7 RNA polymerase promoter sequence that allows RNA to be synthesized according to standard techniques. cDNA synthesis during the amplification portion of SELEX is primed by a DNA of the sequence:

cDNA primer (PCR primer 1): 5′ GCCGGATCCGGGCCTC-ATGTGAA 3′ (SEQ ID NO:36)

[0240] The DNA primers used to amplify the cDNA in that portion of the SELEX cycle include, in one of them, the T7 promoter; that PCR primer has the sequence:

[0241] PCR primer 2: 5′ CCGAAGCTTAATACGACTCACTATAGGGAGCTCAGAATAAACGCTCAA 3′ (SEQ ID NO:37)

[0242] The initial randomized DNA included the sequence with the T7 promoter, 30 randomized positions, and the fixed sequence complementary to PCR primer 1. The RNA that is used to begin the first cycle of SELEX thus has the sequence:

pppGGGAGCUCAGAAUAAACGCUCAA-30N-UUCGACAUGAGGCCCGGAUCCGGC (SEQ ID NO:38)
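The relationship between PCR primer 2 and the 5′ fixed region of the starting RNA can be checked with a short sketch. It assumes the standard 17-nucleotide T7 promoter core, with transcription beginning at the first G immediately downstream; the function names are illustrative, and the 3′ fixed region is simply copied from the RNA sequence stated above rather than derived.

```python
T7_PROMOTER = "TAATACGACTCACTATA"  # promoter core; transcript starts just after it
PCR_PRIMER_2 = "CCGAAGCTTAATACGACTCACTATAGGGAGCTCAGAATAAACGCTCAA"  # SEQ ID NO:37
FIXED_3P_RNA = "UUCGACAUGAGGCCCGGAUCCGGC"  # 3' fixed region as stated in the text


def transcribed_region(primer, promoter=T7_PROMOTER):
    """Return the RNA copy of the primer sequence downstream of the promoter."""
    start = primer.index(promoter) + len(promoter)
    return primer[start:].replace("T", "U")


def starting_rna(random_len=30):
    """Lay out 5' fixed region, randomized positions (N) and 3' fixed region."""
    return transcribed_region(PCR_PRIMER_2) + "N" * random_len + FIXED_3P_RNA
```

Under these assumptions the transcribed part of primer 2 reproduces the 5′ fixed region GGGAGCUCAGAAUAAACGCUCAA of the starting RNA.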

[0243] SELEX was performed for seven rounds, after which cDNA was prepared and cloned as described previously. The series of sequences designated “H” was obtained with the simple HSV DNA polymerase as the target, while the “U” series was obtained with the heterodimeric polymerase that includes the UL42 polypeptide.

[0244] About 25% of the sequences from the H series contain an exact sequence of 12 nucleotides at the 5′ end of the randomized region (the upper case letters are from the randomized region). In some sequences the length between the fixed primers was not exactly 30 nucleotides, and in one case (H2) a large deletion was found within the randomized region. The members of this H subset include:

               xxxxxxxxxxxx
H5:   --cgcucaaUAAGGAGGCCACGGACAACAUGGUACAGcuucgaca-- (SEQ ID NO:39)
H10:  --cgcucaaUAAGGAGGCCACAACAAAIGGAGACAAAuucgaca-- (SEQ ID NO:40)
H4:   --cgcucaaUAAGGAGGCCACACACAUAGGUAGACAUGuucgaca-- (SEQ ID NO:41)
H19:  --cgcucaaUAAGGAGGCCACAUACAAAAGGAUGAGUAAAuucgaca-- (SEQ ID NO:42)
H20:  --cgcucaaUAAGGAGGCCACAAAUGCUGGUCCACCGAGAuucgaca-- (SEQ ID NO:43)
H38:  --cgcucaaUAGGGAGGGCACGGGAAGGGUGAGUGGAUAuucgaca-- (SEQ ID NO:44)
H2:   --cgcucaaUAAGGAGGCCACAAGuucgaca-- (SEQ ID NO:45)
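A check for the shared 12-nucleotide sequence at the 5′ end of the randomized region can be sketched as follows. The sketch relies on the convention stated above that upper case letters come from the randomized region; the function names are illustrative assumptions, and only four of the listed clones are included.

```python
import re

MOTIF = "UAAGGAGGCCAC"  # shared 12-nucleotide sequence

CLONES = {
    "H5": "--cgcucaaUAAGGAGGCCACGGACAACAUGGUACAGcuucgaca--",
    "H4": "--cgcucaaUAAGGAGGCCACACACAUAGGUAGACAUGuucgaca--",
    "H38": "--cgcucaaUAGGGAGGGCACGGGAAGGGUGAGUGGAUAuucgaca--",
    "H2": "--cgcucaaUAAGGAGGCCACAAGuucgaca--",
}


def randomized_region(clone):
    """Return the first upper case run, i.e. the randomized region."""
    match = re.search(r"[A-Z]+", clone)
    return match.group(0) if match else ""


def motif_mismatches(clone, motif=MOTIF):
    """Count positions where the start of the randomized region differs
    from the motif; positions missing from a short region count as mismatches."""
    window = randomized_region(clone)[:len(motif)]
    differing = sum(a != b for a, b in zip(window, motif))
    return differing + (len(motif) - len(window))
```

With these inputs H5, H4 and H2 carry the motif exactly, while H38 differs from it at two positions.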

[0245] Two members of the U series share this primary sequence motif:

U9:   --cgcucaaUAAGGAGGGCCACAGAUGUAAUGGAAACuucgaca-- (SEQ ID NO:46)
U13:  --gcucaaUAAGGAGGCCACAUACAAAAGGAUGAGUAAAAuucgaca-- (SEQ ID NO:47)

[0246] The remaining sequences from the H and U series show no obvious common sequence; in addition, no sequences from the seventh round emerged as winning single sequences in either series, suggesting that more rounds of SELEX will be required to find the best ligand family for inhibiting HSV DNA polymerase.

[0247] It appears that the primary sequence

[0248] - - - cgcucaaUAAGGAGGCCAC . . . (nucleotides 1-19 of SEQ ID NO. 39) may be a candidate for an antagonist species, but those members of the series have yet to be tested as inhibitors of DNA synthesis. It appears that the fixed sequence just 5′ to the UAAGGAGGCCAC (nucleotides 8-19 of SEQ ID NO. 39) must participate in the emergence of this subset, or the shared 12 nucleotides would have been positioned variably within the randomized region.

Example 12 Isolation of a Nucleic Acid Ligand for E. coli Ribosomal Protein S1

[0249] The E. coli 30S ribosomal protein S1 is the largest of the 21 30S proteins. The protein has been purified based on its high affinity for polypyrimidines, and is thought to bind rather tightly to single stranded polynucleotides that are pyrimidine rich. It was questioned if the RNA identified as a ligand solution by SELEX was in any way more information rich than a simple single stranded RNA rich in pyrimidines.

[0250] The RNAs, DNAs, cDNA primer (PCR primer 1), and PCR primer 2 were identical to those used for HSV-1 DNA polymerase (see, Example 11). The binding buffer contained 100 mM ammonium chloride plus 10 mM magnesium chloride plus 2 mM dithiothreitol plus 10 mM Tris-chloride, pH 7.5. Binding was at room temperature, and complexes were once again separated by nitrocellulose filtration. The protein was purified according to I. Boni et al., European J. Biochem., 121, 371 (1982).

[0251] After 13 SELEX rounds, a set of 25 sequences was obtained. More than twenty of those sequences contained pseudoknots, and those pseudoknots contain elements in common.

[0252] The general structure of pseudoknots can be diagrammed as:

[0253] STEM 1a—LOOP 1—STEM 2a—STEM 1b—LOOP 2—STEM 2b (See FIG. 31)

[0254] Most of the S1 protein ligands contain:

[0255] STEM 1 of 4 to 5 base pairs, with a G just 5′ to LOOP 1

[0256] LOOP 1 of about 3 nucleotides, often ACA

[0257] STEM 2 of 6 to 7 base pairs, stacked directly upon STEM 1

[0258] LOOP 2 of 5 to 7 nucleotides, often ending with GGAAC
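The features listed above for most S1 ligands can be expressed as a simple consistency check on a pseudoknot that has already been decomposed into its stems and loops. The data structure, field names and the tolerance of 2 to 4 nucleotides for "about 3" are assumptions made for illustration; the example ligand is hypothetical and is not one of the isolated sequences.

```python
from dataclasses import dataclass


@dataclass
class Pseudoknot:
    stem1_bp: int        # number of base pairs in STEM 1
    base_before_loop1: str  # nucleotide just 5' to LOOP 1
    loop1: str           # LOOP 1 sequence
    stem2_bp: int        # number of base pairs in STEM 2
    loop2: str           # LOOP 2 sequence


def matches_s1_consensus(pk):
    """Check the stem and loop sizes against the features listed in the text."""
    return (
        4 <= pk.stem1_bp <= 5
        and pk.base_before_loop1 == "G"
        and 2 <= len(pk.loop1) <= 4   # "about 3 nucleotides" (assumed tolerance)
        and 6 <= pk.stem2_bp <= 7
        and 5 <= len(pk.loop2) <= 7
    )


# Hypothetical ligand using the frequently observed ACA and GGAAC elements.
example = Pseudoknot(stem1_bp=5, base_before_loop1="G", loop1="ACA",
                     stem2_bp=6, loop2="AGGAAC")
```

The frequently observed sequences (ACA in LOOP 1, GGAAC at the end of LOOP 2) are described as common rather than required, so the check above tests sizes only.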

[0259] A reasonable interpretation of these data is that LOOP 2 is stretched across STEM 1 so as to hold that loop rigidly in a form that simplifies and enhances the binding of the single strand to the active site of protein S1. A picture of the consensus pseudoknot in two dimensions would look like this:

        |----------------R N G
       ||------------Y        G
       ||   |-----(C/G)        A
       ||   |     |--(U A)      A
       ||   |     |   |---C  C
       ||   |     |   N-N′
       ||   |     |   N-N′
       ||   |     |   A-u
       ||   |     |   A-u
       ||   |     |   G-c
 5′--NNNYR (G/C) (A/U) GACAC-gNNNNNNN---3′

[0260] In such figures the base pairs are shown as lines and dashes, the selections of bases from the randomized region are shown in upper case letters, Y is a pyrimidine, R is a purine, N-N′ means any base pair, N means any nucleotide, and the lower case letters are from the fixed sequence used for PCR amplifications.

[0261] It appears that single-stranded polynucleotide binding proteins and domains within proteins will often select, during SELEX, a pseudoknot which presents the extended, rigid single strand called LOOP 2 to the binding site of the protein in a manner that maximizes the interactions with that site. Thus, when the HIV-1 RT pseudoknot emerged, it is reasonable to think that the single stranded domain LOOP 2 is bound within the region of RT that holds the template strand during replication. That is, it appears reasonable that most replication enzymes (DNA polymerase, RNA polymerase, RNA replicases, reverse transcriptases) will have a domain for holding the template strand that might prefer a pseudoknot as the ligand of choice from SELEX.

Example 13 Isolation of a Nucleic Acid Ligand to HIV-1 Rev Protein

[0262] The HIV-1 rev protein's RNA-recognition site appears to be complex, and its function is essential to the productive infection of an epidemic viral disease. See, Olsen et al., Science, vol. 247, pp. 845-848 (1990). The SELEX process on this protein was performed in order to learn more about the recognition element and to isolate a ligand to the target protein.

[0263] A candidate mixture was created with a 32 nucleotide long random region as described above in Example 2. It was found that the rev protein could saturably bind the starting candidate mixture with a half-maximal binding occurring at about 1×10⁻⁷ as determined by nitrocellulose assays. All RNA-protein binding reactions were performed in a binding buffer of 200 mM KOAc, 50 mM Tris-HCl pH 7.7, 10 mM dithiothreitol. RNA and protein dilutions were mixed and stored on ice for 30 minutes then transferred to 37 degrees for 5 minutes. (In binding assays the reaction volume is 60 μl of which 50 μl is assayed; in SELEX rounds the reaction volume is 100 μl.) Each reaction is suctioned through a prewet (with binding buffer) nitrocellulose filter and rinsed with 3 ml of binding buffer after which it is dried and counted for assays or subjected to elution as part of the SELEX protocol. Ten rounds of SELEX were performed, using an RNA concentration of about 3×10⁻⁵ M. The concentration of rev protein was 1×10⁻⁷ in the first round, and 2.5×10⁻⁸ in all subsequent rounds. The initial candidate mixture was run over a nitrocellulose filter to reduce the number of sequences that have a high affinity for nitrocellulose. This process was also repeated after rounds 3, 6, and 9. The cDNA product was purified after every third round of selection to avoid anomalously sized species which will typically arise with repeated rounds of SELEX. After 10 rounds the sequence in the variable region of the RNA population was nonrandom as determined by dideoxy-chain termination sequencing. 53 isolates were cloned and sequenced.
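The selection conditions quoted above can be turned into molecule counts and RNA-to-protein ratios with a few lines of arithmetic. The sketch assumes that the rev protein concentrations are molar (the text gives them without units) and uses the 100 μl SELEX reaction volume; the function name and return keys are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def selex_round(rna_molar, protein_molar, volume_ul=100.0):
    """Return molecule counts and the RNA:protein molar ratio for one round."""
    volume_l = volume_ul * 1e-6
    return {
        "rna_molecules": rna_molar * volume_l * AVOGADRO,
        "protein_molecules": protein_molar * volume_l * AVOGADRO,
        "rna_per_protein": rna_molar / protein_molar,
    }


first_round = selex_round(3e-5, 1e-7)      # protein concentration assumed molar
later_rounds = selex_round(3e-5, 2.5e-8)   # protein concentration assumed molar
```

Under these assumptions each 100 μl round contains roughly 1.8×10¹⁵ RNA molecules, with about 300 RNA molecules per protein molecule in the first round and about 1200 in later rounds.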

[0264] Each of the cloned sequences is listed in Table 10. All sequences were analyzed by the Zuker RNA secondary structure prediction program. See, Zuker, Science, vol. 244, pp. 48-52 (1989); Jaeger et al., Proc. Natl. Acad. Sci. USA, vol. 86, pp. 7706-7710 (1989). On the basis of common secondary structure all sequences have been grouped into three common motifs as shown in Table 11. Motifs I and II are similar in conformation including a bulged loop closed at each end by a helix. This generalized structure has been illustrated schematically in Table 12, and the domains labeled for easy discussion; that is from 5′ to 3′ Stem 1a (which base pairs to the 3′ Stem 1b), Loop 1, Stem 2a, Loop 3, Stem 2b, Loop 2, and Stem 1b. The sequences which fit in the various domains are listed for individual sequences in Table 12. (Note that in sequence 3a, the homologous alignment is flipped 180 degrees so that it is Stem 1 which is closed with a loop.) The energies of folding of the RNA molecule (including the fixed flanking sequences) are shown in Table 13.

[0265] The wild-type rev responsive element (RRE) that has been determined to be at least minimally involved in binding of rev to HIV-1 transcripts was also folded by this program, and is included in Tables 12 and 13.

[0266] The sequences were also searched for related subsequences by a procedure based on that described in Hertz et al., Comput. Appl. Biosci., vol. 6, pp. 81-92 (1990). Two significant patterns were identified. Each isolate was scored to identify its best match to the patterns, the results of which can be seen in Table 13. The related subsequence motifs are presented by the common secondary structures in similar conformations; that is, the first sequence UUGAGAUACA (SEQ ID NO:48) is commonly found as Loop 1 plus the 3′ terminal CA, which pairs with the UG at the 5′ end of the second information rich sequence UGGACUC (commonly Loop 3). There is also a strong prediction of base-pairing of the GAG of sequence I to the CUC of sequence II. Motif II is similar to Motif I in that the subsequence GAUACAG predominates as a loop opposite CUGGACAC with a similar pairing of CA to UG. Motif II differs in the size of the loops and some of the sequence, particularly in the absence of predicted base-pairing across the loop. One domain of the wild-type RRE closely resembles Motif II. Motif III is the least like all the other sequences, although it is characterized by two bulged U's adjacent to base-paired GA-UC as in Motif I. Unfortunately, further comparisons are complicated because the folding pattern of Motif III involves the 3′ fixed sequence region in critical secondary structures; because these sequences are invariant there is no way to analyze the importance of any one of them. The folded sequences of representatives of each Motif are shown in FIG. 23 with the folded sequence of the wild-type RRE.
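The pairings described above (CA with UG, and GAG with CUC) can be verified with a minimal antiparallel Watson-Crick check. This is an illustrative sketch only; it ignores G-U wobble pairs and says nothing about folding energies, which the structure prediction program handles.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def can_pair(strand_a, strand_b):
    """True if two equal-length strands, both written 5' to 3',
    can form a fully Watson-Crick paired antiparallel helix."""
    if len(strand_a) != len(strand_b):
        return False
    return all((a, b) in WATSON_CRICK
               for a, b in zip(strand_a, reversed(strand_b)))
```

For instance, `can_pair("CA", "UG")` and `can_pair("GAG", "CUC")` both hold, consistent with the pairings predicted between the two information rich sequences.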

[0267] The sequences were further analyzed for their affinity to the rev protein. Templates were PCR'd from a number of clones from which labeled in vitro transcripts were prepared and individually assayed for their ability to bind rev protein. These binding curves are shown in FIGS. 24 to 28. Labeled transcripts from oligonucleotide templates were also synthesized which contain the wild-type RRE discussed above, and what is inferred to be the consensus motif in a highly stable conformation. To control for experimental variations, the best binding sequence, isolate 6a, was assayed as a standard in every binding experiment. The RNA-protein mixtures were treated as described above except that diluted RNA's were heated to 90 degrees for 1 minute and cooled on ice prior to mixing. The average Ka for isolate 6a was 8.5×10⁻⁸ M, and the results of this experiment are shown in Table 13.

[0268] The binding curves of FIG. 24 show that the evolved population (P) improved approximately 30-fold for binding to rev protein relative to the starting candidate mixture. The binding of the wild-type RRE closely resembles that of the most abundant clone, 1c. This experiment also illustrates how sensitive the rev binding interaction is to secondary structure. Isolates 6a and 6b are identical in the regions of high information content, but are quite different at the level of secondary structure resulting in changes at three nucleotide positions. These changes, which predict the base-pairing of Stem 1, lower the affinity of 6b by 24-fold. Sensitivity to secondary structure anomalies is further illustrated by the binding of isolate 17 as shown in FIG. 25. Isolate 17 has the maximum information score as shown in Table 12. However, there is an extra bulged U at the 5′ end of Loop 1 as shown in Table 11. This extra U results in isolate 17's reduced affinity for rev as compared to other sequences of Motif I. In contrast, single nucleotide deletions of Loop 2 sequences, even those that diminish the prospect of cross-bulge base-pairing, are well tolerated by the rev interaction.
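The fold differences in affinity discussed above can be related to binding constants with a short calculation. The sketch treats the value reported for isolate 6a (8.5×10⁻⁸ M) as an apparent dissociation constant, which is an assumption based on its units, and uses a simple one-site binding isotherm with protein in excess; neither assumption is stated in the text, and the derived value for 6b is an estimate rather than a reported measurement.

```python
KD_6A = 8.5e-8  # M, average value reported for isolate 6a (assumed to be a Kd)


def weaker_kd(kd_reference, fold):
    """Apparent Kd of a ligand whose affinity is lower by the given fold."""
    return kd_reference * fold


def fraction_bound(protein_molar, kd):
    """One-site isotherm: fraction of RNA bound with protein in excess."""
    return protein_molar / (protein_molar + kd)


kd_6b_estimate = weaker_kd(KD_6A, 24)  # 24-fold lower affinity than 6a
```

Under these assumptions the estimate for 6b is about 2×10⁻⁶ M, and at a protein concentration equal to a ligand's Kd half of that ligand is bound.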

[0269] Another compelling commonality is the conservation of the sequence ACA opposite UGG where the CA pairs with the UG to begin Stem 2. This sequence is shared by Motifs I and II as well as by the wild-type RRE. Sequences 11 and 12 exhibit a base-pair substitution at this position (see Table 12), and sequence 12 was tested and has reduced affinity compared to most of the other Motif I sequences.

[0270] The RNA sequences determined by SELEX to be rev ligands may be classified by primary and secondary structure. A consensus emerges of an asymmetric bulge flanked by two helices in which are configured specifically conserved single and double stranded nucleotides. Although base-pairing across the bulge is predicted for many of the sequences isolated (Motif I), it may not be essential or crucial to rev interaction. Optimal sizes for Loop 1 appear to be 8 (Motif I) or 6 (Motif III) where there is an observed penalty for sizes of 9 or 3. Optimal sizes for Loop 3 are 5 and 4. In addition, the interaction of rev with the various domains of these ligands may be additive. Motif II resembles Motif I primarily at the junction of Loops 1 and 3 at Stem 2. Motif III resembles Motif I at the junction of Loops 1 and 3 at Stem 1. Consensus diagrams of the Motif I and II nucleic acid solutions for HIV-rev are shown in FIGS. 29 and 30.

[0271] The abundance of sequences in the cloned population is not strictly correlated with affinity to rev protein. It is possible that the concentration of rev protein used throughout the SELEX process was sufficient to bind a significant percentage of all these isolates. As a consequence, there may have been selection for replicability of cDNA and DNA during PCR superimposed on a low stringency selection for binding to rev. The highly structured nature of these ligands and the possible differences in the efficiency of cDNA synthesis on these templates reinforce this potential replicative bias. Also, there is some mutation that occurs during the SELEX process. The sequence 6a so resembles 6b that they must have a common ancestor. This relatively late arrival during the rounds of SELEX may explain the paucity of this sequence irrespective of its higher affinity to the target. In the same manner, some of the ligands that have emerged may have mutated relatively recently during selection from ancestor sequences that exist in the initial candidate mixture but are not represented in the cloned population.

[0272] The invention disclosed herein is not limited in scope to the embodiments disclosed herein. As disclosed, the invention can be applied by those of ordinary skill in the art to a large number of nucleic acid ligands and targets. Appropriate modifications, adaptations and expedients for applying the teachings herein in individual cases can be employed and understood by those skilled in the art, within the scope of the invention as disclosed and claimed herein.

TABLE 1
1) 5′-TAATACGACTCACTATAGGGAGCCAACACCACAATTCCAATCAAG-3′ (SEQ ID NO:4)
2) 5′-GGGCTATAAACTAAGGAATATCTATGAAAG-3′ (SEQ ID NO:5)
3) 5′-GAATTGTGGTGTTGGCTCCCTATAGTGAGTCGTATTA-3′ (SEQ ID NO:6)
4) 5′-ATATTCCTTAGTTTATAGCCCNNNNNNNNAGGCTCTTGATTG-3′ (SEQ ID NO:7) and
5) 5′-GTTTCAATAGAGATATAAAATTCTTTCATAG-3′ (SEQ ID NO:8)

[0273] TABLE 2
1a) 5′-taatacgactcactatagggagccaacaccacaattccaatcaag-3′ (SEQ ID NO:49) (bridging oligo for 5′ construction and 5′ PCR oligo)
1b) 5′-taatacgactcactatagggagcatcagacttttaatctgacaatcaag-3′ (SEQ ID NO:50) (bridging oligo for 5′ construction and 5′ PCR oligo)
2) 5′-atctatgaaagaattttatatctc-3′ (SEQ ID NO:51) (bridging oligo for 3′ ligation)
3a) 5′-gaattgtggtgttggctccctatagtgagtcgtatta-3′ (SEQ ID NO:52) (template construction oligo)
3b) 5′-tcagattaaaagtctgatgctccctatagtgagtcgtatta-3′ (SEQ ID NO:53) (template construction oligo)
4) 5′-tttcatagatnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnncttgattg-3′ (SEQ ID NO:54) (template construction oligo)
5) 5′-ccggatccgtttcaatagagatataaaattc-3′ (SEQ ID NO:55) (3′ cloning oligo and template construction oligo)
6) 5′-gtttcaatagagatataaaattctttcatag-3′ (SEQ ID NO:56) (3′ primer for PCR)
7) 5′-ccgaagcttctaatacgactcactatagggag-3′ (SEQ ID NO:57) (5′ PCR primer for cloning and for inhibition assay)
8) 5′-agagatataaaattctttcatagnnnnttttcccgnnnnnnnncggaanncttgattgtcagattaaaagtc-3′ (SEQ ID NO:58) (random template for SELEX experiment 3)
9) 5′-gacgttgtaaaacgacggcc-3′ (SEQ ID NO:59) (3′ PCR and RT extension primer for inhibition assay)

[0274] TABLE 3 starting RNA (SEQ ID NO:60)5′-gggagcaucagacuuuuaaucugacaaucaag- [-32 n's-]-                             -aucuaugaaagaauuuuauaucucuauugaaac-3′isolate 1.1 ucaagAAUUCCGUUUUCAGUCGGGAAAAACUGAACAaucu (13) (SEQ ID NO:61)1.2 ucaagCGUAGGUUAUGAAUGGAGGAGGUAGGGUCGUAaucu (5) (SEQ ID NO:62) 1.3aucaagAAUAUCUUCCGAAGCCGAACGGGAAAACCGGCaucu (1) (SEQ ID NO:63) 1.3b-----------------G-----------------A----- (1) (SEQ ID NO:64) 1.3c---------C-------G----------------------- (1) (SEQ ID NO:65) 1.3d-----------------G--------------------C-- (1) (SEQ ID NO:66) 1.3e-----------------G--------------------A-- (1) (SEQ ID NO:67) 1.4ucaagGGCAUCUGGGAGGGUAAGGGUAAGGUUGUCGGaucu (4) (SEQ ID NO:68) 1.5ucaagCCCACGGAUGUCGAAGGUGGAGGUUGGGCGGCaucu (3) (SEQ ID NO:69) 1.6ucaagAAGAAGAUUACCCAAGCGCAGGGGAGAAGCGCaucu (2) (SEQ ID NO:70) 1.7ucaagGAAUCGACCCAAGCCAAAGGGGAUAAUGCGGCaucu (2) (SEQ ID NO:71) 1.8ucaagGAUUAACCGACGCCAACGGGAGAAUGGCAGGGaucu (2) (SEQ ID NO:72) 1.9aucaagAGAGUAUCAUC GUGCCGGCGGGAUAUCGGCGaucu (1) (SEQ ID NO:73) 1.9b----------------C------------------------ (1) (SEQ ID NO:74) 1.10aucaagUUUGAACAAGCGGAACAUGCACAGCUACACUCaucu (1) (SEQ ID NO:75) 1.10b------C----------------------C---------- (1) (SEQ ID NO:76) 1.11ucaagCUCACGGAUGUCGAAGGUGGAGGUUGGGCGGCAuc (1) (SEQ ID NO:77) 1.12ucaagCAUAGACCGCGUAGGGGGAGGUAGGAGCGGCCaucu (1) (SEQ ID NO:78) 1.13ucaagCUCUUUCAUAGACCGCGGAGGAGGUUGGGAGaucu (1) (SEQ ID NO:79) 1.14ucaagUUCCUAGUAGACUGAGGGUGGGAGUGGUGGAUGucu (1) (SEQ ID NO:80) 1.15ucaagCCAAUUACUUAUUUCGCCGACUAACCCCAAGAaucu (1) (SEQ ID NO:81) 1.16ucaagGAGGCCAAUUCCAUGUAACAAGGUGCAACUAAUaucu (1) (SEQ ID NO:82) 1.17ucaagUGCGUAUGAAGAGUAUUUAGUGCAGGCCACGCaucu (1) (SEQ ID NO:83) 1.18ucaagUAAUGACCAGAGGCCCAACUGGUAAACGGGCGGucu (1) (SEQ ID NO:84) 1.19ucaagAGACUCCACCUGACGUGUUCAACUAUCUGGCGaucu (1) (SEQ ID NO:85)

[0275] TABLE 4

Pseudoknot Motif
1.1 ucaagAAUUCCGUUUUCAGUCGGGAAAAACUGAACAaucu (13) (SEQ ID NO:86)
1.3a ucaagAAUAUCUUCCGAAGCCGAACGGGAAAACCGGCaucu (1) (SEQ ID NO:87)
2.9 ucaagGUUUCCGAAAGAAAUCGGGAAAACUGucu (1) (SEQ ID NO:88)
2.4a ucaagUAGAUAUCCGAACCUCAACGGGAUAAUGAGCaucu (3) (SEQ ID NO:89)
2.7a ucaagAUAUGAUCCGUAAGAGGACGGGAUAAACCUCAa-cu (3) (SEQ ID NO:90)
1.7 ucaagGAAUCGACCCAAGCCAAAGGGGAUAAUGCGGCaucu (2) (SEQ ID NO:91)
2.11 ucaagUCAUAUUACCGUUACUCCUCGGGAUAAAGGAGaucu (1) (SEQ ID NO:92)
1.18 ucaagUAAUGACCAGAGGCCCAACUGGUAAACGGGCGGucu (1) (SEQ ID NO:93)
1.8 ucaagGAUUAACCGACGCCAA-CGGGAGAAUGGCAGGGaucu (2) (SEQ ID NO:94)
2.1b ucaagAAUAUAUCCGAACUCGA-CGGGAUAACGAGAAGaGcu (7) (SEQ ID NO:95)
1.6 ucaagAAGAAGAUUACCCAAGCGCA-GGGGAGAAGCGCaucu (2) (SEQ ID NO:96)
2.10 ucaagUAAAUGAGUCCGUAGGAGG-CGGGAUAUCUCCAAcu (1) (SEQ ID NO:97)
1.9b ucaagAGAGUAUCAUCCGUGCCGG--CGGGAUAUCGGCGaucu (1) (SEQ ID NO:98)
2.12 ucaagAAUAAUCCGACUCG---CGGGAUAACGAGAAGAGcu (1) (SEQ ID NO:99)
1.10b ucaagUUCGAACAAG--CGGAACAUGCACAGCCACACUCaucu (1) (SEQ ID NO:100)
2.3a caagUUAAACAUAAUCCGUGAUCUUUCACACGGGAGaucuaugaaaga (7) (SEQ ID NO:101)
2.2b aaucaagUACCUAGG-UGAUAAAAGGGAGAACACGUGUGa-cu (1) (SEQ ID NO:102)
2.2b aaucaagUACCUAGGUGAUAAA-AGGGAGAACACGUGUGa-cu (1) (SEQ ID NO:103)
2.5a ucaagAUAGUAUCCGUUCUUGAUCAUCGGGACAAAUGaucu (3) (SEQ ID NO:104)
2.6b ucaagUGAAACUUAACCGUUAUCAUAGAUCGGGACAAaucuaugaa (2) (SEQ ID NO:105)

Nitrocellulose retention motif
1.2 ucaagCGUAGGUUAUGAAUGGAGGAGGUAGGGUCGUAaucuaug (5) (SEQ ID NO:106)
1.4 aucugacaaucaagGGCAUCUGGGAGGGUAAGGGUAAGGUUGUCGGaucu (4) (SEQ ID NO:107)
1.5 ucaagCCCACGGAUGUCGAAGGUGGAGGUUGGGCGGCaucu (3) (SEQ ID NO:108)
1.11 ucaagCUCACGGAUGUCGAAGGUGGAGGUUGGGCGGCAuc (1) (SEQ ID NO:109)
1.12 ucaagCAUAGACCGCGUAGGGGGAGGUAGGAGCGGCCaucuaug (1) (SEQ ID NO:110)
1.13 ucaagCUCUUUCAUAGACCGCGGAGGAGGUUGGGAGaucuaugaaaga (1) (SEQ ID NO:111)
1.14 ucaagUUCCUAGUAGACUGAGGGUGGGAGUGGUGGAUGucuuau (1) (SEQ ID NO:112)

[0276] TABLE 5

Clone Freq. Stem 1(a) Loop 1 Stem 2(a) Loop 2 Stem 1(b) Loop 3 Stem 2(b) Seq. No.
2.3a 9* UCCGUG A UCUUUCA — CACGGG AGaucuaugaaaga SEQ ID NO:113
2.9 1 UUCCGA A AGA AA UCGGGA AAACUG ucu SEQ ID NO:114
2.7a 4* UCCGU UAA GAGG — ACGGG AUAAA CCUC SEQ ID NO:115
1.1 13 UUCCG UU UUCAGU — CGGGA AAA ACUGAA SEQ ID NO:116
1.3c 5* UUCCG AG GCCGAA CGGGA AAAC CGGC SEQ ID NO:117
1.18 1 ACCAG AG GCCC AA CUGGU AAAC GGGC SEQ ID NO:118
2.4a 7* UCCG AA GCUCA A CGGG AUAA UGAGC SEQ ID NO:119
2.1b 19* UCCG AA CUCG A CGGG AUAA CGAG SEQ ID NO:120
2.10 1 UCCG UA GGAGG — CGGG AUA UCUCC SEQ ID NO:121
1.90 2* UCCG — UGCCGG — CGGG AUA UCGGCG SEQ ID NO:122
2.12 1 UCCG A CUCG — CGGG AUAA CGAG SEQ ID NO:123
1.10b 2* UCCG AA CA AG CGGA ACA UG SEQ ID NO:124
2.5a 6* UCCG UUCUU GAUCAU — CGGG ACAA AUGauc SEQ ID NO:125
2.11 1 CCG UUA CUCCU — CGG GAUAA AGGAG SEQ ID NO:126
1.8 2 CCG AC GCCA A CGG GAGAA UGGC SEQ ID NO:127
2.6b 5* CCG UUA UCAUAGAU — CGG GACAA aucuauga SEQ ID NO:128
1.7 2 CCC AA GCC AAA GGGGAUAAUGC GGC SEQ ID NO:129
1.6 2 CCC AA GCGC A GGG GAGAA GCGC SEQ ID NO:130
2.20 17* CCU AG GUG AUAAA AGG GAGAA CAC SEQ ID NO:131

[0277] TABLE 6 starting RNA (SEQ ID NO:132)5′-gggagccaacaccacaauuccaaucaag-[32n's-]-aucuaugaaagaauuuuauaucucuauugaaac-3′ isolate 2.1a ucaag  AAUAUAUCCGAACUCGACGGGAUAACGAGAA Gaucu (3) (SEQ ID NO:133) 2.1b ------------------------------------------G-- (7) (SEQ ID NO:134) 2.1c-----CA-----------------------------------G-- (1) (SEQ ID NO:135) 2.1d--------C---------------------------------G-- (1) (SEQ ID NO:136) 2.1e -----------------------------G------------G-- (1) (SEQ ID NO:137) 2.1f ------------------------------------------G-- (1) (SEQ ID NO:138) 2.1g ----------------------------------------C-C-- (1) (SEQ ID NO:139) 2.1h------------A------------------------------G-- (1) (SEQ ID NO:140) 2.1i------GU-----------------------------------G-- (1) (SEQ ID NO:141) 2.1j----------------------------------------A--G-- (1) (SEQ ID NO:142) 2.1k--------------C----------------------------G-- (1) (SEQ ID NO:143) 2.2aucaagUACCUAGGUGAUAAAGGGAGAACACGUGA acu (1) (SEQ ID NO:144) 2.2b---------------------------------UG--- (13) (SEQ ID NO:145) 2.2c------------------------------A---G--- (2) (SEQ ID NO:146) 2.2d----------------------------------G--- (1) (SEQ ID NO:147) 2.3aucaagUUAAACAUAAUCCGUGAUCUUUCACACGGGAGaucu (7) (SEQ ID NO:148) 2.3b--------------------------------------C-- (1) (SEQ ID NO:149) 2.3c----------------------------------A---A-- (1) (SEQ ID NO:150) 2.4aucaagUA GAUAUCCGAAGCUCAACGGGAUAAUGAGCaucu (3) (SEQ ID NO:151) 2.4b-----C-AAUU------------------------------- (1) (SEQ ID NO:152) 2.4c--------------------------------------G--- (1) (SEQ ID NO:153) 2.4d-----A------------------------------------ (1) (SEQ ID NO:154) 2.4e-----U--AU--------------U----------------- (1) (SEQ ID NO:155) 2.5aucaagAUAGUAUCCGUUCUUGAUCAUCGGGACAAAUGaucu (3) (SEQ ID NO:156) 2.5b------C---------------------------------- (1) (SEQ ID NO:157) 2.5c-----U----------------------------------- (1) (SEQ ID NO:158) 2.5d--------------A-------------------------- (1) (SEQ ID NO:159) 2.6aucaagUGAA CUUAACCGUUAUCAUAGAUCGGGACAAa 
cu (1) (SEQ ID NO:160) 2.6b---------A----------------------------u-- (2) (SEQ ID NO:161) 2.6c--------------------------------------u-- (1) (SEQ ID NO:162) 2.6d---------A---------------------U------u-- (1) (SEQ ID NO:163) 2.7aucaagAUAUG AUCCGUAAGAGGACGGGAUAAACCUCAacu (3) (SEQ ID NO:164) 2.7b----------U--------------------------G--- (1) (SEQ ID NO:165) 2.8ucaagGGGUAUUGAGAUAUUCCGAUGUCCUAUGCUGUaCcu (2) (SEQ ID NO:166) 2.9ucaagGUUUCCGAAAGAAAUCGGGAAAACUGucu (1) (SEQ ID NO:167) 2.10ucaagUAAAUGAGUCCGUAGGAGGCGGGAUAUCUCCAAcu (1) (SEQ ID NO:168) 2.11ucaagUCAUAUUACCGUUACUCCUCGGGAUAAAGGAGaucu (1) (SEQ ID NO:169) 2.12ucaagAAUAAUCCGACUCGCGGGAUAACGAGAAGAGcu (1) (SEQ ID NO:170) 2.13ucaagGAUAAGUGCAGGAAUAUCAAUGAGGCAUCCAAaCcu (1) (SEQ ID NO:171) 2.14ucaagAUGAGAUAAAGUACCAAUCGAACCUAUCUAAUACGAcu (1) (SEQ ID NO:172) 2.15ucaagACCCAUUUAUUGCUACAAUAAUCCUUGACCUCaucu (1) (SEQ ID NO:173) 2.16ucaagUAAUACGAUAUACUAAUGAAGCCUAAUCUCGaucu (1) (SEQ ID NO:174) 2.17ucaagAACGAUCAUCGAUAUCUCUUCCGAUCCGUUUGucu (1) (SEQ ID NO:175) 2.18ucaagACGAUAGAACAAUCAUCUCCUACGACGAUGCAcu (1) (SEQ ID NO:176) 2.19ucaagAUAAUCAUGCAGGAUCAUUGAUCUCUUGUGCUaucu (1) (SEQ ID NO:177) 2.20ucaagAGUGAAGAUGUAAGUGCUUAUCUCUUGGGACACaucu (1) (SEQ ID NO:178) 2.21ucaagCAACAUUCUAUCAAGUAAAGUCACAUGAUaucu (1) (SEQ ID NO:179) 2.22ucaagGAUGUAUUACGAUUACUCUAUACUGCCUGCaucu (1) (SEQ ID NO:180) 2.23ucaagGGAUGAAAAUAGUUCCUAGUCUCAUUACGACCAcu (1) (SEQ ID NO:181) 2.24ucaagUAGUGUGAUAAUGAAUGGGUUUAUCGUAUGUGGCcu (1) (SEQ ID NO:182) 1.1ucaagAAUUCCGUUUUCAGUCGGGAAAAACUGAACAaucu (17) (SEQ ID NO:183)

[0278] TABLE 7

starting RNA (SEQ ID NO:184)
5′-gggagcaucagacuuuuuaaucugacaaucaagNNttccgNNNNNNNNcgggaaaaNNNN-
cuaugaaagaauuuuauaucucuauugaaac-3′

isolate
3-2   tcaagTAttccgAAGCTCAAcgggaaaaTGAGcta (SEQ ID NO:185)
3-3   tcaagTAttccgAAGCTTGAcgggaaaaTAAGcta (SEQ ID NO:186)
3-6   tcaagGAttccgAAGTTCAAcgggaaaaTGAActa (SEQ ID NO:187)
3-7   tcaagAGttccgAAGGTTAAcgggaaaaTGACcta (SEQ ID NO:188)
3-25  tcaagGAttccgAAGTGTAAcgggaaaaTGCActa (SEQ ID NO:189)
3-50  tcaagTAttccgAGGTGCCACgggaaaaGGCActa (SEQ ID NO:190)
3-22  tcaagTAttccgAAGGGTAAcgggaaaaTGCCcta (SEQ ID NO:191)
3-8   tcaagTAttccgAAGTACAAcgggaaaaCGTActa (SEQ ID NO:192)
3-13  tcaagGAttccgAAGTGTAAcgggaaaaCGCActa (SEQ ID NO:193)
3-23  tcaagGAttccgAAGCATAAcgggaaaaCATGcta (SEQ ID NO:194)
3-43  tcaggGAttccgAAGTGTAAcgggaaaaAGCActa (SEQ ID NO:195)
3-45  tcaagTAttccgAGGTGTGAcgggaaaaGACActa (SEQ ID NO:196)
3-21  tcaagTAttccgAAGGGTAAcgggaaaaTGACcta (SEQ ID NO:197)
3-9   tcaagTGttccgAGAGGCAAcgggaaaaGAGCcta (SEQ ID NO:198)
3-37  tcaagTAttccgAAGGTGAAcgggaaaaTACActa (SEQ ID NO:199)
3-56  tcaagAGttccgAAAGTCGAcgggaaaaTAGActa (SEQ ID NO:200)
3-58  tcaagATttccgAGAGACAAcgggaaaaGAGTcta (SEQ ID NO:201)
3-39  tcaagATttccgATGTGCAAcgggaaaaTGCActa (SEQ ID NO:202)
3-33  tcaagTAttccgACGTAACAcgggaaaaGTTActa (SEQ ID NO:203)
3-46  tcaagATttccgACGCACAAcgggaaaaTGTGcta (SEQ ID NO:204)
3-52  tcaagTAttccgATGTCTAAcgggaaaaTAGGcta (SEQ ID NO:205)
3-16  tcaagGGttccgATGCCCAAcgggaaaaGGGGcta (SEQ ID NO:206)
3-34  tcaagAAttccgACGACGAAcgggaaaaACGTcta (SEQ ID NO:207)
3-35  tcaagTAttccgATGTACAAcgggaaaaAGTActa (SEQ ID NO:208)
3-60  tccagCGttccgTAAGTGGAcgggaaaaACCActa (SEQ ID NO:209)
3-27  tcaagAGttccgTAAGGCCAcgggaaaaAGGTcta (SEQ ID NO:210)
3-15  tcaagGAttccgAAAGGTAAcgggaaaaATGCcta (SEQ ID NO:211)
3-18  tcaagAAttccgCTAGCCCAcgggaaaaGGGCcta (SEQ ID NO:212)
3-31  tcaagAAtt-cgTTAGTGTAcgggaaaaAACActa (SEQ ID NO:213)
3-26  tcaagCGttccgATGGCTAAcgggaaaaATAGcta (SEQ ID NO:214)
3-32  tcaagGAttccgTTTGTGCAcgggaaaaGGCActa (SEQ ID NO:215)
3-54  tcaagAA-tccgTTTGCACAcgggaaaaCGTGcta (SEQ ID NO:216)
3-41  tcaggAA-tccgAGAAGCTAcgggaaaaAGCGActa (SEQ ID NO:217)
3-29  tcaagATttccgAGGTCCGAcgggaaaaTGGTcta (SEQ ID NO:218)
3-20  tcaagTAttccgAAGGAAAAcgggaaaaCCACcta (SEQ ID NO:219)
3-36  tcaagTGttccgAAGGAAAAcgggaaaaCCACcta (SEQ ID NO:220)
3-28  tcaagAATtccgTAAGGGGTcgggaaaaACCctau (SEQ ID NO:221)
3-48  tcaagGAttccgTATGTCCTcgggaaaaAGGActa (SEQ ID NO:222)
3-59  tcaagAGttccgAAAGGTAAcgggaaaaTTACcta (SEQ ID NO:223)
3-12  tcaagTAttccgATAGTCAAcgggaaaaGCGActa (SEQ ID NO:224)
3-30  tcaagTAttccgAGGTGTTAcgggaaaaCACGcta (SEQ ID NO:225)
3-11  tcaagAAttccgTATGTGATcgggaaaaACCActa (SEQ ID NO:226)
3-17  tcaagGAttccgATGTACAAcgggaaaaCTGTcta (SEQ ID NO:227)
3-24  tcaagATttccgAAGGATAAcgggaaaaACCGActa (SEQ ID NO:228)
3-51  tcaagAAttccgAAGCGTAAcgggaaaaCATActa (SEQ ID NO:229)
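
The isolates of Table 7 follow the starting-RNA template NNttccgNNNNNNNNcgggaaaaNNNN, so each row can be split into its three variable fields. The following Python sketch is illustrative only and is not part of the disclosed method: the function names are hypothetical, the flanking `tcaag` and `cta` are taken from the rows as printed, and the complementarity check (strict Watson-Crick pairing only, no G-U wobble) is merely one property a reader might choose to inspect.

```python
import re

# Table 7 template, upper-cased: fixed 'tcaag' and 'cta' flanks as printed in
# the isolate rows, with variable fields of 2, 8 and 4 nucleotides.
TEMPLATE = re.compile(
    r"^TCAAG([ACGT]{2})TTCCG([ACGT]{8})CGGGAAAA([ACGT]{4})CTA$"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def split_isolate(seq):
    """Return the (2-nt, 8-nt, 4-nt) variable fields, or None if the row
    deviates from the template (for example, rows printed with a '-' gap)."""
    match = TEMPLATE.match(seq.upper())
    return match.groups() if match else None


def reverse_complement(dna):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return dna.translate(COMPLEMENT)[::-1]


def tail_is_complementary(seq):
    """True if the reverse complement of the 4-nt field occurs inside the
    8-nt field (strict Watson-Crick pairing; G-U wobble is not counted)."""
    fields = split_isolate(seq)
    if fields is None:
        return None
    _, eight, four = fields
    return reverse_complement(four) in eight


if __name__ == "__main__":
    isolate_3_2 = "tcaagTAttccgAAGCTCAAcgggaaaaTGAGcta"   # SEQ ID NO:185
    isolate_3_31 = "tcaagAAtt-cgTTAGTGTAcgggaaaaAACActa"  # SEQ ID NO:213
    print(split_isolate(isolate_3_2))
    print(tail_is_complementary(isolate_3_2))
    print(split_isolate(isolate_3_31))
```

Rows that carry a gap character or an extra nucleotide do not match the strict pattern and are reported as `None` rather than being force-fitted.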

[0279] TABLE 8 Template Construction: GGG AGCCA ACACC ACAAU UCCAA UCAAG-[32N]- AUCUA UGAAA GAAUU UUAUA UCUCU AUUGA AAC (SEQ ID NO:230) ΔofDownstream kcal Clone 32n Random Region Constant Region mol Clones withAUCA loops 1   CAG AGAUA UCACU UCUGU UCACC AUCA GGGGA    CUAUG AAAGA-13.0 (SEQ ID NO:231) 2   AU AUAAG UAAUG GAUGC GCACC AUCA GGGCGU AU CUAUGAAAGA- 19.0 (SEQ ID NO:232) 3  GGAAU AAGUG CUUUC GUCGA UCACC AUCA GGG AUCUAUG AAAGA- 17.5 (SEQ ID NO:233) 4  UGGAG UAUAA ACCUU UAUGG UCACC AUCAGGG AU CUAUG AAAGA- 13.3 (SEQ ID NO:234) 5   UCA GAGAU AGCUC AUAGG ACACCAUCA GGG  U CUAUG AAAGA- 13.6 (SEQ ID NO:235) 6  CUGA GAUAU AUGAC AGAGUCCACC AUCA GGG AU CUAUG AAAGA- 10.0 (SEQ ID NO:236) 7  GGAUU AAUAU GUCUGCAUGA UCACC AUCA GGG AU CUAUG AAAGA- 12.6 (SEQ ID NO:237) 8   G GGAGAUUCUU AGUAC UCACC AUCA GGGGGCA    CUAUG AAAGA- 12.6 (SEQ ID NO:238) 9  A AAUUA UCUUC GGAAU GCACC AUCA GGGCA UGG    CUAUG AAAGA- 10.9 (SEQ IDNO:239) 10   G GGAGA UUCUU ACUAC UCACC AUCA GGGGG CA    CUAUG AAAGA-10.3 (SEQ ID NO:240) 11  GGA AUACU UUCUU UCGAU GCACC AUCA GGGCG  U CUAUGAAAGA- 17.6 (SEQ ID NO:241) 12 UCCA AUAGA GUUAG UAGUU GCACC AUCA GGGC AUCUAUG AAAGA- 11.8 (SEQ ID NO:242) 13  GUAU AGAUA GUUCU ACUGA UCACG AUCACGGG  U CUAUG AAAGA- 9.7 (SEQ ID NO:243) 14  GGAU AUCAU CUUAU GGUAUGCACG AUCA CGGC AU CUAUG AAAGA- 17.5 (SEQ ID NO:244) 15  uUG UCUUU CAUGUAGUAA GCACG AUCA CGGCG  A CUAUG AAAAGA- 10.5 (SEQ ID NO:245) 16 AGAGCUAGUU CUUGU UUAAG ACACG AUCA CGG  U CUAUG AAAGA- 12.6 (SEQ ID NO:246) 17 ACG AGAUU UAUUU AGAUG UCACG AUCA CGGGC AC CUAUG AAAGA- 7.8 (SEQ IDNO:247) 18 UAAU  UGAUA CUUGC AGAGG AUCA CCCUG CUCG AU CUAUG AAAGA- 10.8(SEQ ID NO:248) 19  AG   AGGAC UCAUU AGAGG AUCA CCCUA GUGCG G  U CUAUGAAAGA- 15.0 (SEQ ID NO:249) 20 GAGAU AUCAU AAUUC AUUGU UGAGC AUCA GCC AUCUAUG AAAGA- 12.6 (SEQ ID NO:250) 21             UGUAU AGAGC AUCA GCCUAUACAU UGCGU GGC  A CUAUG AAAGA- 12.9 (SEQ ID NO:251) 22 GAGA UCAAU AGUAAGGACC AUCA GGCCU GG    CUAUG AAAGA- 14.6 (SEQ ID NO:252) 23 UGAG AUAUCUCUAU AGUGU GGAGC AUCA 
GCCC AU CUAUG AAAGA- 15.3 (SEQ ID NO:253) 24  AUGAGA UAGAU CAUGC UCAGG AUCA CCGGG CUAUG    AAAGA- 11.3 (SEQ ID NO:254)25 AGAG UAUUC UACAU GAUUU GCAUC AUCU GGGCG     UAUG AAAGA- 9.3 (SEQ IDNO:255) 26 GGAUU AAUUC GUCUU UUGAG UGACG AUCA CGC  A CUAUG AAAGA- 13.3(SEQ ID NO:256) 27  A    UUGCG UAAUG UUACC AUCA GGAAC ACCGC GU AU CUAUGAAAGA- 11.4 (SEQ ID NO:257) 28        GA   GUAAG AUAGC AUCA GCAUC UUGUUCCCGC C AU CUAUG AAAGA- 14.6 (SEQ ID NO:258) Clones with ANCA loops 29GCGUU AAUUU GGAUU AUAGA UCACC AACA GGG AC CUAUG AAAGA- 7.9 (SEQ IDNO:259) 30 GAGA UGUUU AGUAC UUCAG CCACC AACA GGGG  U CUAUG AAAGA- 14.2(SEQ ID NO:260) 31 GUCA UACUC UCUUU GUnnU GCACC AACA GGGC AU CUAUGAAAGA- 9.4 (SEQ ID NO:261) 32             AUAGU AGAGG AACA CCCUA CUAAGUCCCC GCC  A CUAUG AAAGA- 9.5 (SEQ ID NO:262) 33 CAACA GAGAU GAUAU CAGGAUGAGG ACCA CCC AU CUAUG AGGA- 11.8 (SEQ ID NO:263) 34 AGAUA UAAUU CUCCUCUUGA UGACC ACCA GCC AU CUAUG AAAGA- 18.5 (SEQ ID NO:264) 35  UAG AGAUAUGAGA UAGUU GCACC ACCA UGGUG AU CUAUG AAAGA- 16.8 (SEQ ID NO:265) 36 AUA UAGGA GAUAU UGUAG UCACG AGCA CGGG    CUAUG AAAGA- 12.5 (SEQ IDNO:266) Clones with no ANCA loop 37 UGCGUCACUUAUUGGAACUCUGGGUGGC A CUAUGAAAGA- 17.7 (SEQ ID NO:267) 38 CUGGAGGAGAUUGUGUAAUCGCUUGAACUCC A CUAUGAAAGA- 9.7 (SEQ ID NO:268)

[0280] TABLE 9

input RNA: 10,726
background: 72

PROTEIN                         COUNTS   MOLARITY       % BOUND
Acetylcholinesterase                66   7.3 × 10⁻⁶      0
                                    88   3.7 × 10⁻⁶      0
                                    94   3.7 × 10⁻⁷      0
N-acetyl-β-D-glucosaminidase        86   9.0 × 10⁻⁹      0
                                    84   4.5 × 10⁻⁹      0
                                    66   4.5 × 10⁻¹⁰     0
Actin                               70   1.9 × 10⁻⁵      0
                                    76   9.7 × 10⁻⁶      0
                                    58   9.7 × 10⁻⁷      0
Alcohol Dehydrogenase             1130   1.4 × 10⁻⁵     10.5
                                   116   7.0 × 10⁻⁶      1.2
                                    90   7.0 × 10⁻⁷      0
Aldehyde Dehydrogenase             898   2.1 × 10⁻⁵      8.4
                                  1078   1.4 × 10⁻⁵     10.1
                                   846   1.4 × 10⁻⁶      7.9
Angiotensin I, human               284   2.6 × 10⁻³      2.6
                                    74   1.3 × 10⁻³      0
                                    70   1.3 × 10⁻⁴      0
Ascorbate Oxidase                 2734   2.4 × 10⁻⁵     25.5
                                  1308   1.2 × 10⁻⁵     12.2
                                   360   1.2 × 10⁻⁶      3.4
Atrial Natriuretic Factor         4758   1.1 × 10⁻⁴     44.4
                                  4416   5.5 × 10⁻⁵     41.2
                                  4176   5.5 × 10⁻⁶     38.9
Bombesin                          1578   2.1 × 10⁻⁴     14.7
                                   650   1.0 × 10⁻⁴      6.1
                                   116   1.0 × 10⁻⁵      1.1
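
Table 9 does not state how the % BOUND column was derived. As a hedged illustration only, the non-zero entries appear consistent with the retained counts divided by the input RNA counts (10,726), rounded to one decimal place, with readings near the background of 72 counts listed as 0. The Python sketch below assumes that relationship; the function name is hypothetical, no background subtraction is applied, and the cutoff below which a reading is listed as 0 is not given in the table and is therefore not modelled.

```python
# Input RNA counts taken from the header of Table 9.
INPUT_RNA_COUNTS = 10726


def percent_bound(counts, input_counts=INPUT_RNA_COUNTS):
    """Percent of input counts retained, rounded to one decimal place.

    Assumption (not stated in the table): a plain ratio of retained counts
    to input counts, without background subtraction.
    """
    return round(100.0 * counts / input_counts, 1)


# (protein, counts, listed % bound) rows copied from Table 9.
ROWS = [
    ("Aldehyde Dehydrogenase", 898, 8.4),
    ("Ascorbate Oxidase", 2734, 25.5),
    ("Atrial Natriuretic Factor", 4758, 44.4),
    ("Bombesin", 650, 6.1),
]

if __name__ == "__main__":
    for protein, counts, listed in ROWS:
        print(f"{protein}: computed {percent_bound(counts)}, listed {listed}")
```

Under this assumption one listed value differs slightly: the 116-count reading for Alcohol Dehydrogenase computes to 1.1, whereas the table lists 1.2.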

[0281] TABLE 10

sequence number, sequence, (no. of isolates)

1a    tcaag----ATGAAGATACAGCTCCAGATGCTGGACACatct (1) (SEQ ID NO:269)
1b    ------G-G---------------T----------------- (1) (SEQ ID NO:270)
1c    ------GAG---------------T----------------- (9) (SEQ ID NO:271)
1d    -----CGAG---------------T----------------- (1) (SEQ ID NO:272)
1e    ------GAG---------------TG---------------- (1) (SEQ ID NO:273)
2     tcaagCTTGAGATACAGATTTCTGATTCTGGCTCGCTatct (5) (SEQ ID NO:274)
3a    tcaagATGGACTCGGTATCAAACGACCTTGAGACACatct (4) (SEQ ID NO:275)
3b    ------------------------G--------------- (1) (SEQ ID NO:276)
4a    tcaagATGGCTGGAGATACA-AACTATTTGGCTCGCCatct (3) (SEQ ID NO:277)
4b    --------------------A-------------------- (1) (SEQ ID NO:278)
4c    -------------------------G--------------- (1) (SEQ ID NO:279)
5     tcaagAAGCCTTGAGATACACTATATAGTGGACCGGCatct (3) (SEQ ID NO:280)
6a    tcaagGGTGCATTGAGAAACACGTTTGTGGACTCTGT-atct (2) (SEQ ID NO:281)
6b    -----A----------------------------G--G---- (2) (SEQ ID NO:282)
7a    tcaagAGCGAAGATACAGAAGACAATACTGGACACGC-atct (2) (SEQ ID NO:283)
7b    -----------------------------------A-T---- (1) (SEQ ID NO:284)
8     tcaagGGGACTCTTTTCAATGATCCTTTAACCAGTCGatct (2) (SEQ ID NO:285)
9a    tcaagAAGAGACATTCGAATGATCCCTTAACCGGTTGatct (1) (SEQ ID NO:286)
9b    -------------C--------------------------- (1) (SEQ ID NO:287)
10    tcaagCACGCATGACACAGATAAACTGGACTACGTGCatct (1) (SEQ ID NO:288)
11    tcaagACACCTTGAGGTACTCTTAACAGGCTCGGTGatct (1) (SEQ ID NO:289)
12    tcaagTTGAGATACCTGAACTTGGGACTCCTTGGTTGatct (1) (SEQ ID NO:290)
13    tcaagGGATCTTGAGATACACACGAATGAGTGGACTCGatct (1) (SEQ ID NO:291)
14    tcaagATCGAATTGAGAAACACTAACTGGCCTCTTTGatct (1) (SEQ ID NO:292)
15    tcaagGCAGCAGATACAGGATATACTGGACACTGCCGatct (1) (SEQ ID NO:293)
16    tcaagGGATATAACGAGTGATCCAGGTAACTCTGTTGatct (1) (SEQ ID NO:294)
17    tcaagGTGGATTTGAGATACACGGAAGTGGACTCTCCatct (1) (SEQ ID NO:295)
18    tcaagAGATAATACAATGATCCTGCTCACTACAGTTGatct (1) (SEQ ID NO:296)
19    tcaagGGAGGTATACAGAATGATCCGGTTGCTCGTTGatct (1) (SEQ ID NO:297)
20    tcaagAGAAGAATAGTTGAAACAGATCAAACCTGGACatct (1) (SEQ ID NO:298)

[0282] TABLE 11 MOTIF IaagGGAUCUUGAGAUACACACGA---AUGAGUGGACUCGaucuaugaaa 13 (1) (SEQ. IDNO:299)         ----------             -------agGUGGAUUUGAGAUACACGG-------AAGUGGACUCUCCaucuauga 17 (1) (SEQ. IDNO:300)         ----------            -------agGGUGCAUUGAGAAACACGU-------UUGUGGACUCUGUaucuauga  6a (2) (SEQ. IDNO:301)         ----------             -------  -CGACCUUGAGACACaucu-3′ 5 -agAUGGACUCGGUAUCAAA-  3A (4) (SEQ. IDNO:302)         ----------              -------agAUCGAAUUGAGAAACACUA--------ACUGGCCUCUUUGaucuaug 14 (1) (SEQ. IDNO:303)         ----------             -------caaucaagUUGAGAUACCUGAA------CUUGGGACUCCUUGGUUGAUc 12 (1) (SEQ. IDNO:304)         ----------             -------aagAUGGCUGGAGAUACAAAAC-----UAUUUGG-CUCGCCaucuauga  4a (3) (SEQ. IDNO:305)         ----------             -------aagAAGCCUUGAGAUACACUAU-----AUAGUGGAC-CGGCaucuauga  5 (3) (SEQ. IDNO:306)         ----------             -------aaucaagCUUGAGAUACAGAUU-UCUGAUUCUGG-CUCGCUaUCUAUGA  2 (5) (SEQ. IDNO:307)         ----------             -------aagACACCUUGAGGUACUCUU-------AACAGG-CUCGGUGaucuaug 11 (1) (SEQ. IDNO:308)         ----------             ------- MOTIF IIucaagGAGAUGAAGAUACAGCUCUA-GAUGCUGGACACaucuauga  1C (9) (SEQ. ID NO:309)             -------          -------aaucaagAGCGAAGAUACAGAAGACAA--UACUGGACACGCaucuau  7A (2) (SEQ. ID NO:310)            -------          -------aaucaagGCAGCAGAUACAGGAU-----AUACUGGACACUGCCGAUc 15 (1) (SEQ. ID NO:311)            -------          -------gAGAAGAAUAGUUGAAACAGAUC----AAACCUGGACaucuaugaaa 20 (1) (SEQ. ID NO:312)            -------          -------aucaagCACGCAUGACACAGAUA------AACUGGACUACGUGCAUc 10 (1) (SEQ. ID NO:313)             -------          ------- MOTIF IIIcaaucaagAGAUAAUACAAUGAUCCUGCUCACUACAGUUGaucuaugaaagaauuuuaucucuau 18 (1)(SEQ. ID NO:314)                 --------            ----ucaagAAGAGACAUUCGAAUGAUCCCUU---AACCGGUUGaucuaugaaagaauuuuauaucucuau  9a(1) (SEQ. 
ID NO:315)                 --------            ----ucaagGGGACUCUUUUCAAUGAUCCUUU---AACCAGUCGaucuaugaaagaauuuuauaucucuau  8(2) (SEQ. ID NO:316)                 --------            ----ucaagGGAGGUAUACAGAAUGAUCCGGU---AACUCUGUUGaucuaugaaagaauuuuauaucucuau 19(1) (SEQ. ID NO:317)                 --------            ----aaucaagGGAUAUAACGAGUGAUCCAGGU-AACUCUGUUGaucuaugaagaauuuuauaucucuau 16(1) (SEQ. ID NO:318)                 --------            ----

[0283] TABLE 12

Clone, Freq., Stem 1 (a), Loop 1, Stem 2 (a), Loop 2, Stem 2 (b), Loop 3, Stem 1 (b), Seq. No.

13   1  GGAUC UUGAGUAUA CACA ACGAAUGA GUGGACUC Gaucu  SEQ ID NO:319
12   1  caaucaag UUGAGAUA CC UGAACUU GG GACUCCUUGGUUG  SEQ ID NO:320
6a   2  gGGUGCA UUGAGAA CACG UU UGUG GACUC UGUaucu  SEQ ID NO:321
3a   4  ACC UUGAGACA Caucu -OPEN- agAUG GACUC GGU  SEQ ID NO:322
14   1  agAUCGAA UUGAGAAA CA CUAAC UG GCCUC UUUGaucu  SEQ ID NO:323
2    5  agC UUGAGAUA CAGA UUUCUGAU UCUG G-CUC GCU  SEQ ID NO:324
11   1  CACCUUGAGGUA CU CUUAA AG G-CUC GGUG  SEQ ID NO:325
4a   3  agAUGGC UGGAGAUA CAAACUA UUUG G-CUC GCCaucu  SEQ ID NO:326
5    3  agA-A-GCC UUGAGAUA CACUAUAUAGUG GAC-C GGC-a-ucu  SEQ ID NO:327
17   1  agGUGGA UUUGAGAUA CAC GGAA GUGGACUC UCCaucu  SEQ ID NO:328
1c   9  AGAUG AAGAUA CAGC UCUAGAU GCUG GACACaucu  SEQ ID NO:329
7a   2  agAGCG AAGAUA CAG AAGACAAUA CUG GACA CGC-a-ucu  SEQ ID NO:330
15   1  gGCAG CAGAUA CAG GAUAUA CUD GACA CUDCC  SEQ ID NO:331
20   1  AUAG UUGAAA CAG AUCAAAC CUG GACau cuau  SEQ ID NO:332
10   1  gCACGCAUGACA CAG AUAAA CUG GACUA CGUGC  SEQ ID NO:333
WT   .  GACGCUG ACGGUA CA-OPEN- UG GGCG CAGCGUC  SEQ ID NO:334
CON  .  GGGACCC UUGAGAUA CACGGC UUCGGCCGUG GACUC GGGUCUC  SEQ ID NO:335

[0284] TABLE 13

                   AG₁         AG₁         Information  Information  Total
                   Consensus   Competing   Score        Score        Information
Clone       Freq.  Structure   Structure   Sequence I   Sequence II  Score        K_(α)/K_(A) ^(Gs)
13          1      −9.9        −9.7        10.25        12.74        30.99        0.576
12          1      −0.9        −10.2       15.79        10.74        26.53        0.298
6a          2      −11.6       −9.9        16.30        12.74        29.12        1.000
6b          2      —           −11.5       16.30        12.74        29.12        0.042
3a          4      −9.5        −0.0        14.79        12.74        27.53        0.405
14          1      −10.1       −10.2       16.30        10.16        26.54        —
2           5      −0.5        −9.1        18.25        3.07         22.12        0.114
11          1      −0.6        −10.9       13.21        6.93         20.14        —
4a          3      −14.2       −15.6       17.40        7.46         24.86        0.564
5           3      −10.4       −10.8       10.25        7.03         26.00        0.567
17          1      −14.0       −12.0       10.25        12.74        30.99        0.247
1c          9      −10.0       −13.2       15.67        11.42        27.09        0.154
7a          2      −6.4        −5.6        13.21        11.42        24.63        —
15          1      −11.1       −10.4       12.62        11.42        24.04        0.455
20          1      −4.0        −6.4        7.39         4.91         12.30        0.191
10          1      −10.2       −0.6        6.60         10.16        16.76        —
8           —      —           —           —            —            —            0.292
9a          —      —           —           —            —            —            0.106
10          —      —           —           —            —            —            0.149
WT          —      −27.0       −26.3       —            —            —            0.100
COI^(v)     —      21.1        −19.7       —            —            —            0.429
32 n        —      —           —           —            —            —            0.015
Evol. Pop.  —      —           —           —            —            —            0.435

[0285]

1 374 15 nucleotides nucleic acid single linear 1 NNNCGNAANU CGNNN 15 15nucleotides nucleic acid single linear 2 AAGUNNGUNN CNNNN 15 15nucleotides nucleic acid single linear 3 AAGUCCGUAA CACAC 15 45nucleotides nucleic acid single linear 4 TAATACGACT CACTATAGGGAGCCAACACC ACAATTCCAA TCAAG 45 30 nucleotides nucleic acid single linear5 GGGCTATAAA CTAAGGAATA TCTATGAAAG 30 37 nucleotides nucleic acid singlelinear 6 GAATTGTGGT GTTGGCTCCC TATAGTGAGT CGTATTA 37 42 nucleotidesnucleic acid single linear 7 ATATTCCTTA GTTTATAGCC CNNNNNNNNA GGCTCTTGATTG 42 31 nucleotides nucleic acid single linear 8 GTTTCAATAG AGATATAAAATTCTTTCATA G 31 20 nucleotides nucleic acid single linear 9 UUCCGNNNNNNNNCGGGAAA 20 31 nucleotides nucleic acid single linear 10 GTTTCAATAGAGATATAAAA TTCTTTCATA G 31 91 nucleotides nucleic acid single linear 11GGGAGCCAAC ACCACAAUUC CAAUCAAGNN NNNNNNNNNN NNNNNNNNNN 50 NNNNNNNNNNAUCUAUGAAA GAAUUUUAUC UCUAUUGAAA C 91 30 nucleotides nucleic acid singlelinear 12 ACGAAACAAA UAAGGAGGAG GAGGGAUUGU 30 30 nucleotides nucleicacid single linear 13 AGGAGGAGGA GGGAGAGCGC AAAUGAGAUU 30 30 nucleotidesnucleic acid single linear 14 AGGAGGAGGA GGUAGAGCAU GUAUUAAGAG 30 30nucleotides nucleic acid single linear 15 UAAGCAAGAA UCUACGAUAAAUACGUGAAC 30 30 nucleotides nucleic acid single linear 16 AGUGAAAGACGACAACGAAA AACGACCACA 30 29 nucleotides nucleic acid single linear 17CCGAGCAUGA GCCUAGUAAG UGGUGGAUA 29 30 nucleotides nucleic acid singlelinear 18 UAAUAAGAGA UACGACAGAA UACGACAUAA 30 30 nucleotides nucleicacid single linear 19 ACAUCGAUGA CCGGAAUGCC GCACACAGAG 30 30 nucleotidesnucleic acid single linear 20 CCUCAGAGCG CAAGAGUCGA ACGAAUACAG 30 30nucleotides nucleic acid single linear 21 CUCAUGGAGC GCAAGACGAAUAGCUACAUA 30 30 nucleotides nucleic acid single linear 22 ACAUCGAUGACCGGAAUGCC GCACACAGAG 30 30 nucleotides nucleic acid single linear 23CCUCAGAGCG CAAGAGUCGA ACGAAUACAG 30 30 nucleotides nucleic acid singlelinear 24 CGGGUGAUUA GUACUGCAGA GCGGAAUGAC 30 30 nucleotides nucleicacid 
single linear 25 UGCGAUAAGA CUUGCUGGGC GAGACAAACA 30 30 nucleotidesnucleic acid single linear 26 AGUCCACAGG GCACUCCCAA AGGGCAAACA 30 30nucleotides nucleic acid single linear 27 ACUCAUGGAG CGCUCGACGAUCACCAUCGA 30 29 nucleotides nucleic acid single linear 28 CAAGGGAGAAUGUCUGUAGC GUCCAGGUA 29 30 nucleotides nucleic acid single linear 29CGACGCAGAG AUACGAAUAC GACAAAACGC 30 30 nucleotides nucleic acid singlelinear 30 GAGGGUGAGG UGGGAGGCAG UGGCAGUUUA 30 30 nucleotides nucleicacid single linear 31 UGAACUAGGG GGAGGGAGGG UGGAAGACAG 30 29 nucleotidesnucleic acid single linear 32 GUGGAGGGGA CGUGGAGGGG AGAGCAAGA 29 30nucleotides nucleic acid single linear 33 CUCAUGGAGC GCAAGACGAAUAGCUACAUA 30 29 nucleotides nucleic acid single linear 34 CCAUAGAGGCCACAAGCAAA GACUACGCA 29 30 nucleotides nucleic acid single linear 35CCUACAAGAA AAGAGGGAAG GAGAAAAAAA 30 23 nucleotides nucleic acid singlelinear 36 GCCGGATCCG GGCCTCATGT GAA 23 48 nucleotides nucleic acidsingle linear 37 CCGAAGCTTA ATACGACTCA CTATAGGGAG CTCAGAATAA ACGCTCAA 4877 nucleotides nucleic acid single linear 38 GGGAGCUCAG AAUAAACGCUCAANNNNNNN NNNNNNNNNN NNNNNNNNNN 50 NNNUUCGACA UGAGGCCCGG AUCCGGC 77 43nucleotides nucleic acid single linear 39 CGCUCAAUAA GGAGGCCACGGACAACAUGG UACAGCUUCG ACA 43 42 nucleotides nucleic acid single linear40 CGCUCAAUAA GGAGGCCACA ACAAANGGAG ACAAAUUCGA CA 42 43 nucleotidesnucleic acid single linear 41 CGCUCAAUAA GGAGGCCACA CACAUAGGUAGACAUGUUCG ACA 43 45 nucleotides nucleic acid single linear 42CGCUCAAUAA GGAGGCCACA UACAAAAGGA UGAGUAAAUU CGACA 45 45 nucleotidesnucleic acid single linear 43 CGCUCAAUAA GGAGGCCACA AAUGCUGGUCCACCGAGAUU CGACA 45 44 nucleotides nucleic acid single linear 44CGCUCAAUAG GGAGGGCACG GGAAGGGUGA GUGGAUAUUC GACA 44 29 nucleotidesnucleic acid single linear 45 CGCUCAAUAA GGAGGCCACA AGUUCGACA 29 42nucleotides nucleic acid single linear 46 CGCUCAAUAA GGAGGGCCACAGAUGUAAUG GAAACUUCGA CA 42 46 nucleotides nucleic acid single linear 47CGCUCAAUAA GGAGGCCACA UACAAAAGGA UGAGUAAAAU UCGACA 46 10 
nucleotidesnucleic acid single linear 48 UUGAGAUACA 10 45 nucleotides nucleic acidsingle linear 49 TAATACGACT CACTATAGGG AGCCAACACC ACAATTCCAA TCAAG 45 49nucleotides nucleic acid single linear 50 TAATACGACT CACTATAGGGAGCATCAGAC TTTTAATCTG ACAATCAAG 49 24 nucleotides nucleic acid singlelinear 51 ATCTATGAAA GAATTTTATA TCTC 24 37 nucleotides nucleic acidsingle linear 52 GAATTGTGGT GTTGGCTCCC TATAGTGAGT CGTATTA 37 41nucleotides nucleic acid single linear 53 TCAGATTAAA AGTCTGATGCTCCCTATAGT GAGTCGTATT A 41 50 nucleotides nucleic acid single linear 54TTTCATAGAT NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNCTTGATTG 50 31 nucleotidesnucleic acid single linear 55 CCGGATCCGT TTCAATAGAG ATATAAAATT C 31 31nucleotides nucleic acid single linear 56 GTTTCAATAG AGATATAAAATTCTTTCATA G 31 32 nucleotides nucleic acid single linear 57 CCGAAGCTTCTAATACGACT CACTATAGGG AG 32 72 nucleotides nucleic acid single linear 58AGAGATATAA AATTCTTTCA TAGNNNNTTT TCCCGNNNNN NNNCGGAANN 50 CTTGATTGTCAGATTAAAAG TC 72 20 nucleotides nucleic acid single linear 59 GACGTTGTAAAACGACGGCC 20 97 nucleotides nucleic acid single linear 60 GGGAGCAUCAGACUUUUAAU CUGACAAUCA AGNNNNNNNN NNNNNNNNNN 50 NNNNNNNNNN NNNNAUCUAUGAAAGAAUUU UAUAUCUCUA UUGAAAC 97 40 nucleotides nucleic acid singlelinear 61 UCAAGAAUUC CGUUUUCAGU CGGGAAAAAC UGAACAAUCU 40 41 nucleotidesnucleic acid single linear 62 UCAAGCGUAG GUUAUGAAUG GAGGAGGUAGGGUCGUAAUC U 41 41 nucleotides nucleic acid single linear 63 UCAAGAAUAUCUUCCGAAGC CGAACGGGAA AACCGGCAUC U 41 41 nucleotides nucleic acid singlelinear 64 UCAAGAAUAU CUUCCGAGGC CGAACGGGAA AACCGACAUC U 41 41nucleotides nucleic acid single linear 65 UCAAGAAUAC CUUCCGAGGCCGAACGGGAA AACCGGCAUC U 41 41 nucleotides nucleic acid single linear 66UCAAGAAUAU CUUCCGAGGC CGAACGGGAA AACCGGCACC U 41 41 nucleotides nucleicacid single linear 67 UCAAGAAUAU CUUCCGAGGC CGAACGGGAA AACCGGCAAC U 4141 nucleotides nucleic acid single linear 68 UCAAGGGCAU CUGGGAGGGUAAGGGUAAGG UUGUCGGAUC U 41 41 nucleotides nucleic acid single linear 
69UCAAGCCCAC GGAUGUCGAA GGUGGAGGUU GGGCGGCAUC U 41 41 nucleotides nucleicacid single linear 70 UCAAGAAGAA GAUUACCCAA GCGCAGGGGA GAAGCGCAUC U 4141 nucleotides nucleic acid single linear 71 UCAAGGAAUC GACCCAAGCCAAAGGGGAUA AUGCGGCAUC U 41 41 nucleotides nucleic acid single linear 72UCAAGGAUUA ACCGACGCCA ACGGGAGAAU GGCAGGGAUC U 41 40 nucleotides nucleicacid single linear 73 UCAAGAGAGU AUCAUCGUGC CGGCGGGAUA UCGGCGAUCU 40 41nucleotides nucleic acid single linear 74 UCAAGAGAGU AUCAUCCGUGCCGGCGGGAU AUCGGCGAUC U 41 41 nucleotides nucleic acid single linear 75UCAAGUUUGA ACAAGCGGAA CAUGCACAGC UACACUCAUC U 41 41 nucleotides nucleicacid single linear 76 UCAAGUUCGA ACAAGCGGAA CAUGCACAGC CACACUCAUC U 4140 nucleotides nucleic acid single linear 77 UCAAGCUCAC GGAUGUCGAAGGUGGAGGUU GGGCGGCAUC 40 41 nucleotides nucleic acid single linear 78UCAAGCAUAG ACCGCGUAGG GGGAGGUAGG AGCGGCCAUC U 41 40 nucleotides nucleicacid single linear 79 UCAAGCUCUU UCAUAGACCG CGGAGGAGGU UGGGAGAUCU 40 41nucleotides nucleic acid single linear 80 UCAAGUUCCU AGUAGACUGAGGGUGGGAGU GGUGGAUGUC U 41 41 nucleotides nucleic acid single linear 81UCAAGCCAAU UACUUAUUUC GCCGACUAAC CCCAAGAAUC U 41 42 nucleotides nucleicacid single linear 82 UCAAGGAGGC CAAUUCCAUG UAACAAGGUG CAACUAAUAU CU 4241 nucleotides nucleic acid single linear 83 UCAAGUGCGU AUGAAGAGUAUUUAGUGCAG GCCACGGAUC U 41 41 nucleotides nucleic acid single linear 84UCAAGUAAUG ACCAGAGGCC CAACUGGUAA ACGGGCGGUC U 41 41 nucleotides nucleicacid single linear 85 UCAAGAGACU CCACCUGACG UGUUCAACUA UCUGGCGAUC U 4140 nucleotides nucleic acid single linear 86 UCAAGAAUUC CGUUUUCAGUCGGGAAAAAC UGAACAAUCU 40 41 nucleotides nucleic acid single linear 87UCAAGAAUAU CUUCCGAAGC CGAACGGGAA AACCGGCAUC U 41 34 nucleotides nucleicacid single linear 88 UCAAGGUUUC CGAAAGAAAU CGGGAAAACU GUCU 34 40nucleotides nucleic acid single linear 89 UCAAGUAGAU AUCCGAAGCUCAACGGGAUA AUGAGCAUCU 40 40 nucleotides nucleic acid single linear 90UCAAGAUAUG AUCCGUAAGA GGACGGGAUA AACCUCAACU 40 41 nucleotides 
nucleicacid single linear 91 UCAAGGAAUC GACCCAAGCC AAAGGGGAUA AUGCGGCAUC U 4141 nucleotides nucleic acid single linear 92 UCAAGUCAUA UUACCGUUACUCCUCGGGAU AAAGGAGAUC U 41 41 nucleotides nucleic acid single linear 93UCAAGUAAUG ACCAGAGGCC CAACUGGUAA ACGGGCGGUC U 41 41 nucleotides nucleicacid single linear 94 UCAAGGAUUA ACCGACGCCA ACGGGAGAAU GGCAGGGAUC U 4141 nucleotides nucleic acid single linear 95 UCAAGAAUAU AUCCGAACUCGACGGGAUAA CGAGAAGAGC U 41 41 nucleotides nucleic acid single linear 96UCAAGAAGAA GAUUACCCAA GCGCAGGGGA GAAGCGCAUC U 41 40 nucleotides nucleicacid single linear 97 UCAAGUAAAU GAGUCCGUAG GAGGCGGGAU AUCUCCAACU 40 41nucleotides nucleic acid single linear 98 UCAAGAGAGU AUCAUCCGUGCCGGCGGGAU AUCGGCGAUC U 41 38 nucleotides nucleic acid single linear 99UCAAGAAUAA UCCGACUCGC GGGAUAACGA GAAGAGCU 38 41 nucleotides nucleic acidsingle linear 100 UCAAGUUCGA ACAAGCGGAA CAUGCACAGC CACACUCAUC U 41 48nucleotides nucleic acid single linear 101 CAAGUUAAAC AUAAUCCGUGAUCUUUCACA CGGGAGAUCU AUGAAAGA 48 41 nucleotides nucleic acid singlelinear 102 AAUCAAGUAC CUAGGUGAUA AAAGGGAGAA CACGUGUGAC U 41 40nucleotides nucleic acid single linear 103 AAUCAAGUAC CUAGGUGAUAAAAGGGAGAA CACGUGUACU 40 41 nucleotides nucleic acid single linear 104UCAAGAUAGU AUCCGUUCUU GAUCAUCGGG ACAAAUGAUC U 41 46 nucleotides nucleicacid single linear 105 UCAAGUGAAA CUUAACCGUU AUCAUAGAUC GGGACAAAUCUAUGAA 46 44 nucleotides nucleic acid single linear 106 UCAAGCGUAGGUUAUGAAUG GAGGAGGUAG GGUCGUAAUC UAUG 44 50 nucleotides nucleic acidsingle linear 107 AUCUGACAAU CAAGGGCAUC UGGGAGGGUA AGGGUAAGGU UGUCGGAUCU50 41 nucleotides nucleic acid single linear 108 UCAAGCCCAC GGAUGUCGAAGGUGGAGGUU GGGCGGCAUC U 41 40 nucleotides nucleic acid single linear 109UCAAGCUCAC GGAUGUCGAA GGUGGAGGUU GGGCGGCAUC 40 44 nucleotides nucleicacid single linear 110 UCAAGCAUAG ACCGCGUAGG GGGAGGUAGG AGCGGCCAUC UAUG44 48 nucleotides nucleic acid single linear 111 UCAAGCUCUU UCAUAGACCGCGGAGGAGGU UGGGAGAUCU AUGAAAGA 48 43 nucleotides nucleic acid 
singlelinear 112 UCAAGUUCCU AGUAGACUGA GGGUGGGAGU GGUGGAUGUC UAU 43 34nucleotides nucleic acid single linear 113 UCCGUGAUCU UUCACACGGGAGAUCUAUGA AAGA 34 27 nucleotides nucleic acid single linear 114UUCCGAAAGA AAUCGGGAAA ACUGUCU 27 26 nucleotides nucleic acid singlelinear 115 UCCGUUAAGA GGACGGGAUA AACCUC 26 27 nucleotides nucleic acidsingle linear 116 UUCCGUUUUC AGUCGGGAAA AACUGAA 27 26 nucleotidesnucleic acid single linear 117 UUCCGAGGCC GAACGGGAAA ACCGGC 26 26nucleotides nucleic acid single linear 118 ACCAGAGGCC CAACUGGUAA ACGGGC26 25 nucleotides nucleic acid single linear 119 UCCGAAGCUC AACGGGAUAAUGAGC 25 23 nucleotides nucleic acid single linear 120 UCCGAACUCGACGGGAUAAC GAG 23 23 nucleotides nucleic acid single linear 121UCCGUAGGAG GCGGGAUAUC UCC 23 23 nucleotides nucleic acid single linear122 UCCGUGCCGG CGGGAUAUCG GCG 23 21 nucleotides nucleic acid singlelinear 123 UCCGACUCGC GGGAUAACGA G 21 19 nucleotides nucleic acid singlelinear 124 UUCGAACAAG CGGAACAUG 19 29 nucleotides nucleic acid singlelinear 125 UCCGUUCUUG AUCAUCGGGA CAAAUGAUC 29 24 nucleotides nucleicacid single linear 126 CCGUUACUCC UCGGGAUAAA GGAG 24 22 nucleotidesnucleic acid single linear 127 CCGACGCCAA CGGGAGAAUG GC 22 30nucleotides nucleic acid single linear 128 CCGUUAUCAU AGAUCGGGACAAAUCUAUGA 30 25 nucleotides nucleic acid single linear 129 CCCAAGCCAAAGGGGAUAAU GCGGC 25 22 nucleotides nucleic acid single linear 130CCCAAGCGCA GGGGAGAAGC GC 22 24 nucleotides nucleic acid single linear131 CCUAGGUGAU AAAAGGGAGA ACAC 24 93 nucleotides nucleic acid singlelinear 132 GGGAGCCAAC ACCACAAUUC CAAUCAAGNN NNNNNNNNNN NNNNNNNNNN 50NNNNNNNNNN AUCUAUGAAA GAAUUUUAUA UCUCUAUUGA AAC 93 41 nucleotidesnucleic acid single linear 133 UCAAGAAUAU AUCCGAACUC GACGGGAUAACGAGAAGAUC U 41 41 nucleotides nucleic acid single linear 134 UCAAGAAUAUAUCCGAACUC GACGGGAUAA CGAGAAGAGC U 41 43 nucleotides nucleic acid singlelinear 135 UCAAGCAAAU AUAUCCGAAC UCGACGGGAU AACGAGAAGA GCU 43 41nucleotides nucleic acid single linear 136 
UCAAGAACAU AUCCGAACUCGACGGGAUAA CGAGAAGAGC U 41 41 nucleotides nucleic acid single linear 137UCAAGAAUAU AUCCGAACUC GACGGGGUAA CGAGAAGAGC U 41 41 nucleotides nucleicacid single linear 138 UCAAGAAUAU AUCCGAACUC GACGGGAUAA CGAGAAGAGC U 4141 nucleotides nucleic acid single linear 139 UCAAGAAUAU AUCCGAACUCGACGGGAUAA CGAGAACACC U 41 41 nucleotides nucleic acid single linear 140UCAAGAAUAA AUCCGAACUC GACGGGAUAA CGAGAAGAGC U 41 42 nucleotides nucleicacid single linear 141 UCAAGGUAUA UAUCCGAACU CGACGGGAUA ACGAGAAGAG CU 4242 nucleotides nucleic acid single linear 142 UCAAGAAUAU AUCCGAACUCGACGGGAUAA CGAGAAAGAG CU 42 42 nucleotides nucleic acid single linear143 UCAAGAAUAU ACUCCGAACU CGACGGGAUA ACGAGAAGAG CU 42 38 nucleotidesnucleic acid single linear 144 UCAAGUACCU AGGUGAUAAA AGGGAGAACA CGUGAACU38 39 nucleotides nucleic acid single linear 145 UCAAGUACCU AGGUGAUAAAAGGGAGAACA CGUGUGACU 39 39 nucleotides nucleic acid single linear 146UCAAGUACCU AGGUGAUAAA AGGGAGAACA CAUGAGACU 39 39 nucleotides nucleicacid single linear 147 UCAAGUACCU AGGUGAUAAA AGGGAGAACA CGUGAGACU 39 41nucleotides nucleic acid single linear 148 UCAAGUUAAA CAUAAUCCGUGAUCUUUCAC ACGGGAGAUC U 41 41 nucleotides nucleic acid single linear 149UCAAGUUAAA CAUAAUCCGU GAUCUUUCAC ACGGGAGACC U 41 41 nucleotides nucleicacid single linear 150 UCAAGUUAAA CAUAAUCCGU GAUCUUUCAC ACGGAAGAAC U 4140 nucleotides nucleic acid single linear 151 UCAAGUAGAU AUCCGAAGCUCAACGGGAUA AUGAGCAUCU 40 42 nucleotides nucleic acid single linear 152UCAAGCAAAU AUAUCCGAAG CUCAACGGGA UAAUGAGCAU CU 42 40 nucleotides nucleicacid single linear 153 UCAAGUAGAU AUCCGAAGCU CAACGGGAUA AUGAGCGUCU 40 40nucleotides nucleic acid single linear 154 UCAAGAAGAU AUCCGAAGCUCAACGGGAUA AUGAGCAUCU 40 41 nucleotides nucleic acid single linear 155UCAAGUAAUA UAUCCGAAGC UCAUCGGGAU AAUGAGCAUC U 41 41 nucleotides nucleicacid single linear 156 UCAAGAUAGU AUCCGUUCUU GAUCAUCGGG ACAAAUGAUC U 4141 nucleotides nucleic acid single linear 157 UCAAGACAGU AUCCGUUCUUGAUCAUCGGG ACAAAUGAUC U 41 
41 nucleotides nucleic acid single linear 158UCAAGUUAGU AUCCGUUCUU GAUCAUCGGG ACAAAUGAUC U 41 41 nucleotides nucleicacid single linear 159 UCAAGAUAGU AUCCAUUCUU GAUCAUCGGG ACAAAUGAUC U 4139 nucleotides nucleic acid single linear 160 UCAAGUGAAC UUAACCGUUAUCAUAGAUCG GGACAAACU 39 41 nucleotides nucleic acid single linear 161UCAAGUGAAA CUUAACCGUU AUCAUAGAUC GGGACAAAUC U 41 40 nucleotides nucleicacid single linear 162 UCAAGUGAAC UUAACCGUUA UCAUAGAUCG GGACAAAUCU 40 41nucleotides nucleic acid single linear 163 UCAAGUGAAA CUUAACCGUUAUCAUAGAUC GUGACAAAUC U 41 40 nucleotides nucleic acid single linear 164UCAAGAUAUG AUCCGUAAGA GGACGGGAUA AACCUCAACU 40 41 nucleotides nucleicacid single linear 165 UCAAGAUAUG UAUCCGUAAG AGGACGGGAU AAACCUCGAC U 4141 nucleotides nucleic acid single linear 166 UCAAGGGGUA UUGAGAUAUUCCGAUGUCCU AUGCUGUACC U 41 34 nucleotides nucleic acid single linear 167UCAAGGUUUC CGAAAGAAAU CGGGAAAACU GUCU 34 40 nucleotides nucleic acidsingle linear 168 UCAAGUAAAU GAGUCCGUAG GAGGCGGGAU AUCUCCAACU 40 41nucleotides nucleic acid single linear 169 UCAAGUCAUA UUACCGUUACUCCUCGGGAU AAAGGAGAUC U 41 38 nucleotides nucleic acid single linear 170UCAAGAAUAA UCCGACUCGC GGGAUAACGA GAAGAGCU 38 41 nucleotides nucleic acidsingle linear 171 UCAAGGAUAA GUGCAGGAAU AUCAAUGAGG CAUCCAAACC U 41 43nucleotides nucleic acid single linear 172 UCAAGAUGAG AUAAAGUACCAAUCGAACCU AUCUAAUACG ACU 43 41 nucleotides nucleic acid single linear173 UCAAGACCCA UUUAUUGCUA CAAUAAUCCU UGACCUCAUC U 41 40 nucleotidesnucleic acid single linear 174 UCAAGUAAUA CGAUAUACUA AUGAAGCCUAAUCUCGAUCU 40 40 nucleotides nucleic acid single linear 175 UCAAGAACGAUCAUCGAUAU CUCUUCCGAU CCGUUUGUCU 40 39 nucleotides nucleic acid singlelinear 176 UCAAGACGAU AGAACAAUCA UCUCCUACGA CGAUGCACU 39 41 nucleotidesnucleic acid single linear 177 UCAAGAUAAU CAUGCAGGAU CAUUGAUCUCUUGUGCUAUC U 41 42 nucleotides nucleic acid single linear 178 UCAAGAGUGAAGAUGUAAGU GCUUAUCUCU UGGGACACAU CU 42 38 nucleotides nucleic acidsingle linear 179 UCAAGCAACA 
UUCUAUCAAG UAAAGUCACA UGAUAUCU 38
39 nucleotides nucleic acid single linear 180 UCAAGGAUGU AUUACGAUUA CUCUAUACUG CCUGCAUCU 39
40 nucleotides nucleic acid single linear 181 UCAAGGGAUG AAAAUAGUUC CUAGUCUCAU UACGACCACU 40
41 nucleotides nucleic acid single linear 182 UCAAGUAGUG UGAUAAUGAA UGGGUUUAUC GUAUGUGGCC U 41
40 nucleotides nucleic acid single linear 183 UCAAGAAUUC CGUUUUCAGU CGGGAAAAAC UGAACAAUCU 40
90 nucleotides nucleic acid single linear 184 GGGAGCAUCA GACUUUUAAU CUGACAAUCA AGNNTTCCGN NNNNNNNCGG 50 GAAAANNNNC UAUGAAAGAA UUUUAUAUCU CUAUUGAAAC 90
35 nucleotides nucleic acid single linear 185 TCAAGTATTC CGAAGCTCAA CGGGAAAATG AGCTA 35
35 nucleotides nucleic acid single linear 186 TCAAGTATTC CGAAGCTTGA CGGGAAAATA AGCTA 35
35 nucleotides nucleic acid single linear 187 TCAAGGATTC CGAAGTTCAA CGGGAAAATG AACTA 35
35 nucleotides nucleic acid single linear 188 TCAAGAGTTC CGAAGGTTAA CGGGAAAATG ACCTA 35
35 nucleotides nucleic acid single linear 189 TCAAGGATTC CGAAGTGTAA CGGGAAAATG CACTA 35
35 nucleotides nucleic acid single linear 190 TCAAGTATTC CGAGGTGCCA CGGGAAAAGG CACTA 35
35 nucleotides nucleic acid single linear 191 TCAAGTATTC CGAAGGGTAA CGGGAAAATG CCCTA 35
35 nucleotides nucleic acid single linear 192 TCAAGTATTC CGAAGTACAA CGGGAAAACG TACTA 35
35 nucleotides nucleic acid single linear 193 TCAAGGATTC CGAAGTGTAA CGGGAAAACG CACTA 35
35 nucleotides nucleic acid single linear 194 TCAAGGATTC CGAAGCATAA CGGGAAAACA TGCTA 35
35 nucleotides nucleic acid single linear 195 TCAGGGATTC CGAAGTGTAA CGGGAAAAAG CACTA 35
35 nucleotides nucleic acid single linear 196 TCAAGTATTC CGAGGTGTGA CGGGAAAAGA CACTA 35
35 nucleotides nucleic acid single linear 197 TCAAGTATTC CGAAGGGTAA CGGGAAAATG ACCTA 35
35 nucleotides nucleic acid single linear 198 TCAAGTGTTC CGAGAGGCAA CGGGAAAAGA GCCTA 35
35 nucleotides nucleic acid single linear 199 TCAAGTATTC CGAAGGTGAA CGGGAAAATA CACTA 35
35 nucleotides nucleic acid single linear 200 TCAAGAGTTC CGAAAGTCGA CGGGAAAATA GACTA 35
35 nucleotides nucleic acid single linear 201 TCAAGATTTC CGAGAGACAA CGGGAAAAGA GTCTA 35
35 nucleotides nucleic acid single linear 202 TCAAGATTTC CGATGTGCAA CGGGAAAATG CACTA 35
35 nucleotides nucleic acid single linear 203 TCAAGTATTC CGACGTAACA CGGGAAAAGT TACTA 35
35 nucleotides nucleic acid single linear 204 TCAAGATTTC CGACGCACAA CGGGAAAATG TGCTA 35
35 nucleotides nucleic acid single linear 205 TCAAGTATTC CGATGTCTAA CGGGAAAATA GGCTA 35
35 nucleotides nucleic acid single linear 206 TCAAGGGTTC CGATGCCCAA CGGGAAAAGG GGCTA 35
35 nucleotides nucleic acid single linear 207 TCAAGAATTC CGACGACGAA CGGGAAAAAC GTCTA 35
35 nucleotides nucleic acid single linear 208 TCAAGTATTC CGATGTACAA CGGGAAAAAG TACTA 35
35 nucleotides nucleic acid single linear 209 TCCAGCGTTC CGTAAGTGGA CGGGAAAAAC CACTA 35
35 nucleotides nucleic acid single linear 210 TCAAGAGTTC CGTAAGGCCA CGGGAAAAAG GTCTA 35
35 nucleotides nucleic acid single linear 211 TCAAGGATTC CGAAAGGTAA CGGGAAAAAT GCCTA 35
35 nucleotides nucleic acid single linear 212 TCAAGAATTC CGCTAGCCCA CGGGAAAAGG GCCTA 35
34 nucleotides nucleic acid single linear 213 TCAAGAATTC GTTAGTGTAC GGGAAAAAAC ACTA 34
35 nucleotides nucleic acid single linear 214 TCAAGCGTTC CGATGGCTAA CGGGAAAAAT AGCTA 35
35 nucleotides nucleic acid single linear 215 TCAAGGATTC CGTTTGTGCA CGGGAAAAGG CACTA 35
34 nucleotides nucleic acid single linear 216 TCAAGAATCC GTTTGCACAC GGGAAAACGT GCTA 34
35 nucleotides nucleic acid single linear 217 TCAGGAATCC GAGAAGCTAC GGGAAAAAGC GACTA 35
35 nucleotides nucleic acid single linear 218 TCAAGATTTC CGAGGTCCGA CGGGAAAATG GTCTA 35
35 nucleotides nucleic acid single linear 219 TCAAGTATTC CGAAGGAAAA CGGGAAAACC ACCTA 35
35 nucleotides nucleic acid single linear 220 TCAAGTGTTC CGAAGGAAAA CGGGAAAACC ACCTA 35
35 nucleotides nucleic acid single linear 221 TCAAGAATTC CGTAAGGGGT CGGGAAAAAC CCTAU 35
35 nucleotides nucleic acid single linear 222 TCAAGGATTC CGTATGTCCT CGGGAAAAAG GACTA 35
35 nucleotides nucleic acid single linear 223 TCAAGAGTTC CGAAAGGTAA CGGGAAAATT ACCTA 35
35 nucleotides nucleic acid single linear 224 TCAAGTATTC CGATAGTCAA CGGGAAAAGC GACTA 35
35 nucleotides nucleic acid single linear 225 TCAAGTATTC CGAGGTGTTA CGGGAAAACA CGCTA 35
35 nucleotides nucleic acid single linear 226 TCAAGAATTC CGTATGTGAT CGGGAAAAAC CACTA 35
35 nucleotides nucleic acid single linear 227 TCAAGGATTC CGATGTACAA CGGGAAAACT GTCTA 35
36 nucleotides nucleic acid single linear 228 TCAAGATTTC CGAAGGATAA CGGGAAAAAC CGACTA 36
35 nucleotides nucleic acid single linear 229 TCAAGAATTC CGAAGCGTAA CGGGAAAACA TACTA 35
93 nucleotides nucleic acid single linear 230 GGGAGCCAAC ACCACAAUUC CAAUCAAGNN NNNNNNNNNN NNNNNNNNNN 50 NNNNNNNNNN AUCUAUGAAA GAAUUUUAUA UCUCUAUUGA AAC 93
42 nucleotides nucleic acid single linear 231 CAGAGAUAUC ACUUCUGUUC ACCAUCAGGG GACUAUGAAA GA 42
44 nucleotides nucleic acid single linear 232 AUAUAAGUAA UGGAUGCGCA CCAUCAGGGC GUAUCUAUGA AAGA 44
44 nucleotides nucleic acid single linear 233 GGAAUAAGUG CUUUCGUCGA UCACCAUCAG GGAUCUAUGA AAGA 44
44 nucleotides nucleic acid single linear 234 UGGAGUAUAA ACCUUUAUGG UCACCAUCAG GGAUCUAUGA AAGA 44
41 nucleotides nucleic acid single linear 235 UCAGAGAUAG CUCAUAGGAC ACCAUCAGGG UCUAUGAAAG A 41
43 nucleotides nucleic acid single linear 236 CUGAGAUAUA UGACAGAGUC CACCAUCAGG GAUCUAUGAA AGA 43
44 nucleotides nucleic acid single linear 237 GGAUUAAUAU GUCUGCAUGA UCACCAUCAG GGAUCUAUGA AAGA 44
42 nucleotides nucleic acid single linear 238 GGGAGAUUCU UAGUACUCAC CAUCAGGGGG CACUAUGAAA GA 42
43 nucleotides nucleic acid single linear 239 AAAUUAUCUU CGGAAUGCAC CAUCAGGGCA UGGCUAUGAA AGA 43
42 nucleotides nucleic acid single linear 240 GGGAGAUUCU UACUACUCAC CAUCAGGGGG CACUAUGAAA GA 42
43 nucleotides nucleic acid single linear 241 GGAAUACUUU CUUUCGAUGC ACCAUCAGGG CGUCUAUGAA AGA 43
44 nucleotides nucleic acid single linear 242 UCCAAUAGAG UUAGUAGUUG CACCAUCAGG GCAUCUAUGA AAGA 44
43 nucleotides nucleic acid single linear 243 GUAUAGAUAG UUCUACUGAU CACGAUCACG GGUCUAUGAA AGA 43
44 nucleotides nucleic acid single linear 244 GGAUAUGAUC UUAUGGUAUG CACGAUCACG GCAUCUAUGA AAGA 44
44 nucleotides nucleic acid single linear 245 UUGUCUUUCA UGUAGUAAGC ACGAUCACGG CGACUAUGAA AAGA 44
43 nucleotides nucleic acid single linear 246 AGAGCUAGUU CUUGUUUAAG ACACGAUCAC GGUCUAUGAA AGA 43
44 nucleotides nucleic acid single linear 247 ACGAGAUUUA UUUAGAUGUC ACGAUCACGG GCACCUAUGA AAGA 44
44 nucleotides nucleic acid single linear 248 UAAUUGAUAC UUGCAGAGGA UCACCCUGCU CGAUCUAUGA AAGA 44
43 nucleotides nucleic acid single linear 249 AGAGGACUCA UUAGAGGAUC ACCCUAGUGC GGUCUAUGAA AGA 43
44 nucleotides nucleic acid single linear 250 GAGAUAUCAU AAUUCAUUGU UGAGCAUCAG CCAUCUAUGA AAGA 44
43 nucleotides nucleic acid single linear 251 UGUAUAGAGC AUCAGCCUAU ACAUUGCGUG GCACUAUGAA AGA 43
40 nucleotides nucleic acid single linear 252 GAGAUCAAUA GUAAGGACCA UCAGGCCUGG CUAUGAAAGA 40
44 nucleotides nucleic acid single linear 253 UGAGAUAUCU CUAUAGUGUG GAGCAUCAGC CCAUCUAUGA AAGA 44
40 nucleotides nucleic acid single linear 254 AUGAGAUAGA UCAUGCUCAG GAUCACCGGG CUAUGAAAGA 40
42 nucleotides nucleic acid single linear 255 AGAGUAUUCU ACAUGAUUUG CAUCAUCUGG GCGUAUGAAA GA 42
43 nucleotides nucleic acid single linear 256 GGAUUAAUUC GUCUUUUGAG UGACGAUCAC GCACUAUGAA AGA 43
44 nucleotides nucleic acid single linear 257 AUUGCGUAAU GUUACCAUCA GGAACACCGC GUAUCUAUGA AAGA 44
44 nucleotides nucleic acid single linear 258 GAGUAAGAUA GCAUCAGCAU CUUGUUCCCG CCAUCUAUGA AAGA 44
44 nucleotides nucleic acid single linear 259 GCGUUAAUUU GGAUUAUAGA UCACCAACAG GGACCUAUGA AAGA 44
43 nucleotides nucleic acid single linear 260 GAGAUGUUUA GUACUUCAGC CACCAACAGG GGUCUAUGAA AGA 43
44 nucleotides nucleic acid single linear 261 GUCAUACUCU CUUUGUNNUG CACCAACAGG GCAUCUAUGA AAGA 44
43 nucleotides nucleic acid single linear 262 AUAGUAGAGG AACACCCUAC UAAGUCCCCG CCACUAUGAA AGA 43
43 nucleotides nucleic acid single linear 263 CAACAGAGAU GAUAUCAGGA UGAGGACCAC CCAUCUAUGA GGA 43
44 nucleotides nucleic acid single linear 264 AGAUAUAAUU CUCCUCUUGA UGAGCACCAG CCAUCUAUGA AAGA 44
44 nucleotides nucleic acid single linear 265 UAGAGAUAUG AGAUAGUUGC ACCACCAGGG UGAUCUAUGA AAGA 44
41 nucleotides nucleic acid single linear 266 AUAUAGGAGA UAUUGUAGUC ACGAGCACGG GCUAUGAAAG A 41
39 nucleotides nucleic acid single linear 267 UGCGUCACUU AUUGGAACUC UGGGUGGCAC UAUGAAAGA 39
42 nucleotides nucleic acid single linear 268 CUGGAGGAGA UUGUGUAAUC GCUUGAACUC CACUAUGAAA GA 42
38 nucleotides nucleic acid single linear 269 TCAAGATGAA GATACAGCTC CAGATGCTGG ACACATCT 38
40 nucleotides nucleic acid single linear 270 TCAAGGGATG AAGATACAGC TCTAGATGCT GGACACATCT 40
41 nucleotides nucleic acid single linear 271 TCAAGGAGAT GAAGATACAG CTCTAGATGC TGGACACATC T 41
42 nucleotides nucleic acid single linear 272 TCAAGCGAGA TGAAGATACA GCTCCAGATG CTGGACACAT CT 42
41 nucleotides nucleic acid single linear 273 TCAAGGAGAT GAAGATACAG CTCTGGATGC TGGACACATC T 41
41 nucleotides nucleic acid single linear 274 TCAAGCTTGA GATACAGATT TCTGATTCTG GCTCGCTATC T 41
40 nucleotides nucleic acid single linear 275 TCAAGATGGA CTCGGTATCA AACGACCTTG AGACACATCT 40
40 nucleotides nucleic acid single linear 276 TCAAGATGGA CTCGGTATCA AACGGCCTTG AGACACATCT 40
40 nucleotides nucleic acid single linear 277 TCAAGATGGC TGGAGATACA AACTATTTGG CTCGCCATCT 40
41 nucleotides nucleic acid single linear 278 TCAAGATGGC TGGAGATACA AAACTATTTG GCTCGCCATC T 41
40 nucleotides nucleic acid single linear 279 TCAAGATGGC TGGAGATACA AACTGTTTGG CTCGCCATCT 40
41 nucleotides nucleic acid single linear 280 TCAAGAAGCC TTGAGATACA CTATATAGTG GACCGGCATC T 41
41 nucleotides nucleic acid single linear 281 TCAAGGGTGC ATTGAGAAAC ACGTTTGTGG ACTCTGTATC T 41
42 nucleotides nucleic acid single linear 282 TCAAGAGTGC ATTGAGAAAC ACGTTTGTGG ACTCGGTGAT CT 42
41 nucleotides nucleic acid single linear 283 TCAAGAGCGA AGATACAGAA GACAATACTG GACACGCATC T 41
42 nucleotides nucleic acid single linear 284 TCAAGAGCGA AGATACAGAA GACAATACTG GACACACTAT CT 42
41 nucleotides nucleic acid single linear 285 TCAAGGGGAC TCTTTTCAAT GATCCTTTAA CCAGTCGATC T 41
41 nucleotides nucleic acid single linear 286 TCAAGAAGAG ACATTCGAAT GATCCCTTAA CCGGTTGATC T 41
41 nucleotides nucleic acid single linear 287 TCAAGAAGAG ACACTCGAAT GATCCCTTAA CCGGTTGATC T 41
41 nucleotides nucleic acid single linear 288 TCAAGCACGC ATGACACAGA TAAACTGGAC TACGTGCATC T 41
40 nucleotides nucleic acid single linear 289 TCAAGACACC TTGAGGTACT CTTAACAGGC TCGGTGATCT 40
41 nucleotides nucleic acid single linear 290 TCAAGTTGAG ATACCTGAAC TTGGGACTCC TTGGTTGATC T 41
42 nucleotides nucleic acid single linear 291 TCAAGGGATC TTGAGATACA CACGAATGAG TGGACTCGAT CT 42
41 nucleotides nucleic acid single linear 292 TCAAGATCGA ATTGAGAAAC ACTAACTGGC CTCTTTGATC T 41
41 nucleotides nucleic acid single linear 293 TCAAGGCAGC AGATACAGGA TATACTGGAC ACTGCCGATC T 41
41 nucleotides nucleic acid single linear 294 TCAAGGGATA TAACGAGTGA TCCAGGTAAC TCTGTTGATC T 41
41 nucleotides nucleic acid single linear 295 TCAAGGTGGA TTTGAGATAC ACGGAAGTGG ACTCTCCATC T 41
41 nucleotides nucleic acid single linear 296 TCAAGAGATA ATACAATGAT CCTGCTCACT ACAGTTGATC T 41
41 nucleotides nucleic acid single linear 297 TCAAGGGAGG TATACAGAAT GATCCGGTTG CTCGTTGATC T 41
41 nucleotides nucleic acid single linear 298 TCAAGAGAAG AATAGTTGAA ACAGATCAAA CCTGGACATC T 41
46 nucleotides nucleic acid single linear 299 AAGGGAUCUU GAGAUACACA CGAAUGAGUG GACUCGAUCU AUGAAA 46
42 nucleotides nucleic acid single linear 300 AGGUGGAUUU GAGAUACACG GAAGUGGACU CUCCAUCUAU GA 42
42 nucleotides nucleic acid single linear 301 AGGGUGCAUU GAGAAACACG UUUGUGGACU CUGUAUCUAU GA 42
37 nucleotides nucleic acid single linear 302 CGACCUUGAG ACACAUCUAG AUGGACUCGG UAUCAAA 37
41 nucleotides nucleic acid single linear 303 AGAUCGAAUU GAGAAACACU AACUGGCCUC UUUGAUCUAU G 41
43 nucleotides nucleic acid single linear 304 CAAUCAAGUU GAGAUACCUG AACUUGGGAC UCCUUGGUUG AUC 43
43 nucleotides nucleic acid single linear 305 AAGAUGGCUG GAGAUACAAA ACUAUUUGGC UCGCCAUCUA UGA 43
43 nucleotides nucleic acid single linear 306 AAGAAGCCUU GAGAUACACU AUAUAGUGGA CCGGCAUCUA UGA 43
47 nucleotides nucleic acid single linear 307 AAUCAAGCUU GAGAUACAGA UUUCUGAUUC UGGCUCGCUA UCUAUGA 47
41 nucleotides nucleic acid single linear 308 AAGACACCUU GAGGUACUCU UAACAGGCUC GGUGAUCUAU G 41
45 nucleotides nucleic acid single linear 309 UCAAGGAGAU GAAGAUACAG CUCUAGAUGC UGGACACAUC UAUGA 45
45 nucleotides nucleic acid single linear 310 AAUCAAGAGC GAAGAUACAG AAGACAAUAC UGGACACGCA UCUAU 45
42 nucleotides nucleic acid single linear 311 AAUCAAGGCA GCAGAUACAG GAUAUACUGG ACACUGCCGA UC 42
43 nucleotides nucleic acid single linear 312 GAGAAGAAUA GUUGAAACAG AUCAAACCUG GACAUCUAUG AAA 43
41 nucleotides nucleic acid single linear 313 AUCAAGCACG CAUGACACAG AUAAACUGGA CUACGUGCAU C 41
67 nucleotides nucleic acid single linear 314 CAAUCAAGAG AUAAUACAAU GAUCCUGCUC ACUACAGUUG AUCUAUGAAA 50 GAAUUUUAUA UCUCUAU 67
64 nucleotides nucleic acid single linear 315 UCAAGAAGAG ACAUUCGAAU GAUCCCUUAA CCGGUUGAUC UAUGAAAGAA 50 UUUUAUAUCU CUAU 64
64 nucleotides nucleic acid single linear 316 UCAAGGGGAC UCUUUUCAAU GAUCCUUUAA CCAGUCGAUC UAUGAAAGAA 50 UUUUAUAUCU CUAU 64
64 nucleotides nucleic acid single linear 317 UCAAGGGAGG UAUACAGAAU GAUCCGGUUG CUCGUUGAUC UAUGAAAGAA 50 UUUUAUAUCU CUAU 64
66 nucleotides nucleic acid single linear 318 AAUCAAGGGA UAUAACGAGU GAUCCAGGUA ACUCUGUUGA UCUAUGAAAG 50 AAUUUUAUAU CUCUAU 66
37 nucleotides nucleic acid single linear 319 GGAUCUUGAG AUACACACGA AUGAGUGGAC UCGAUCU 37
40 nucleotides nucleic acid single linear 320 CAAUCAAGUU GAGAUACCUG AACUUGGGAC UCCUUGGUUG 40
37 nucleotides nucleic acid single linear 321 GGGUGCAUUG AGAAACACGU UUGUGGACUC UGUAUCU 37
29 nucleotides nucleic acid single linear 322 ACCUUGAGAC ACAUCUAGAU GGACUCGGU 29
38 nucleotides nucleic acid single linear 323 AGAUCGAAUU GAGAAACACU AACUGGCCUC UUUGAUCU 38
34 nucleotides nucleic acid single linear 324 AGCUUGAGAU ACAGAUUUCU GAUUCUGGCU CGCU 34
30 nucleotides nucleic acid single linear 325 CACCUUGAGG UACUCUUAAC AGGCUCGGUG 30
37 nucleotides nucleic acid single linear 326 AGAUGGCUGG AGAUACAAAC UAUUUGGCUC GCCAUCU 37
38 nucleotides nucleic acid single linear 327 AGAAGCCUUG AGAUACACUA UAUAGUGGAC CGGCAUCU 38
38 nucleotides nucleic acid single linear 328 AGGUGGAUUU GAGAUACACG GAAGUGGACU CUCCAUCU 38
35 nucleotides nucleic acid single linear 329 AGAUGAAGAU ACAGCUCUAG AUGCUGGACA CAUCU 35
38 nucleotides nucleic acid single linear 330 AGAGCGAAGA UACAGAAGAC AAUACUGGAC ACGCAUCU 38
32 nucleotides nucleic acid single linear 331 GGCAGCAGAU ACAGGAUAUA CUGGACACUG CC 32
32 nucleotides nucleic acid single linear 332 AUAGUUGAAA CAGAUCAAAC CUGGACAUCU AU 32
33 nucleotides nucleic acid single linear 333 GCACGCAUGA CACAGAUAAA CUGGACUACG UGC 33
28 nucleotides nucleic acid single linear 334 GACGCUGACG GUACAUGGGC GCAGCGUC 28
43 nucleotides nucleic acid single linear 335 GGGACCCUUG AGAUACACGG CUUCGGCCGU GGACUCGGGU CUC 43
28 nucleotides nucleic acid single linear 336 NNNGAGCCUA GCAACCUGGG CUAGGAAU 28
27 nucleotides nucleic acid single linear N 7-12 This symbol stands for the complementary base for the Y's located in positions 22-27 337 UUCCGANNNN NNACGGGANA AYYYYYY 27
15 nucleotides nucleic acid single linear 338 NNCACCAUC AGGGNN 15
17 nucleotides nucleic acid single linear 339 GAGCGCAAGA CGAAUAG 17
12 nucleotides nucleic acid single linear 340 UAAGGAGGCC AC 12
34 nucleotides nucleic acid single linear N 6 AND 18 This symbol stands for A or U 341 NNYRSNGACA CGAANNCNSY RNGGAACNNU CGNN 34
29 nucleotides nucleic acid single linear N 1-5, 16 This symbol stands for the complementary base for the Y's located at positions 17, and 25-29 342 NNNNNUUGAG ANACANYUGG ACUCYYYYY 29
26 nucleotides nucleic acid single linear N 1-4 This symbol stands for the complementary base for the Y's located in positions 22-25 343 NNNNGNNGAN ACAGCUGGAC ACYYYY 26
33 nucleotides nucleic acid single linear 344 UUCGAAUGAU CCCUUAACCG GUUGAUCUAU GAA 33
51 nucleotides nucleic acid single linear 345 UAAUAUAUCA AGAGCCUAAU AACUCGGGCU AUAAACUAAG GAAUAUCUAU 50 G 51
20 nucleotides nucleic acid single linear 346 GAGCCUNNNN NNNNGGGCUA 20
20 nucleotides nucleic acid single linear 347 GAGCCUARYA ACYYGGGCUA 20
20 nucleotides nucleic acid single linear 348 GAGCCUAAUA ACUCGGGCUA 20
20 nucleotides nucleic acid single linear 349 GAGCCUAGCA ACCUGGGCUA 20
73 nucleotides nucleic acid single linear 350 TAATACGACT CACTATAGGG AGCATCAGAC TTTTAATCTG ACAATCAAGA 50 TCTATGAAAG AATTTTATAT CTC 73
41 nucleotides nucleic acid single linear 351 TCAGATTAAA AGTCTGATGC TCCCTATAGT GAGTCGTATT A 41
31 nucleotides nucleic acid single linear 352 CCGGATCCGT TTCAATAGAG ATATAAAATT C 31
50 nucleotides nucleic acid single linear 353 TTTCATAGAT NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNCTTGATTG 50
105 nucleotides nucleic acid single linear 354 GGGAGCAUCA GACUUUUAAU CUGACAAUCA AGNNNNNNNN NNNNNNNNNN 50 NNNNNNNNNN NNNNAUCUAU GAAAGAAUUU UAUAUCUCUA UUGAAACGGA 100 UCCGG 105
40 nucleotides nucleic acid single linear 355 UCAAGAAUUC CGUUUUCAGU CGGGAAAAAC UGAACAAUCU 40
41 nucleotides nucleic acid single linear 356 UCAAGAAUAU CUUCCGAAGC CGAACGGGAA AACCGGCAUC U 41
41 nucleotides nucleic acid single linear 357 UCAAGAAUAU CUUCCGAGGC CGAACGGGAA AACCGACAUC U 41
41 nucleotides nucleic acid single linear 358 UCAAGGGCAU CUGGGAGGGU AAGGGUAAGG UUGUCGGAUC U 41
41 nucleotides nucleic acid single linear 359 UCAAGAAUAU AUCCGAACUC GACGGGAUAA CGAGAAGAUC U 41
39 nucleotides nucleic acid single linear 360 UCAAGUACCU AGGUGAUAAA AGGGAGAACA CGUGUGACU 39
41 nucleotides nucleic acid single linear 361 UCAAGACAGU AUCCGUUCUU GAUCAUCGGG ACAAAUGAUC U 41
40 nucleotides nucleic acid single linear 362 UCAAGAAUUC CGUUUUCAGU CGGGAAAAAC UGAACAAUCU 40
30 nucleotides nucleic acid single linear 363 AGUCGGGAAA AACUGAACAA UCUAUGAAAG 30
30 nucleotides nucleic acid single linear 364 GCCGAACGGG AAAACCGGCA UCUAUGAAAG 30
25 nucleotides nucleic acid single linear 365 CAAUCAAGAA UUCCGUUUUC AGUCG 25
25 nucleotides nucleic acid single linear 366 AAGAAUAUCU UCCGAAGCCG AACGG 25
23 nucleotides nucleic acid single linear 367 NNNRNNCACC AUCAGGGNNY NNN 23
35 nucleotides nucleic acid single linear 368 AGAUGAAGAU ACAGCUCUAG AUGCUGGACA CAUCU 35
33 nucleotides nucleic acid single linear 369 UUCGAAUGAU CCCUUAACCG GUUGAUCUAU GAA 33
31 nucleotides nucleic acid single linear 370 UGGGCGCAGC GUCAAUGACG CUGACGGUAC A 31
16 nucleotides nucleic acid single linear 371 NNNNNUKGAG RHACHN 16
13 nucleotides nucleic acid single linear 372 NDGGMCUCNN NNN 13
14 nucleotides nucleic acid single linear 373 NNNNGHWGAH ACAG 14
12 nucleotides nucleic acid single linear 374 CUGGACACNN NN 12

1. A diagnostic composition comprising a nucleic acid ligand.